ABSTRACT

Flux ratio 'anomalies' in quadruply-imaged gravitational lenses can be explained with
galactic substructure of the sort predicted by ΛCDM, but the strength and uniqueness
of that hypothesis need to be further assessed. A good way to do that is to use the
physical scale associated with the size of the source quasar, and its dependence on
wavelength. We develop a toy model to study finite source effects in substructure lensing.
Treating substructure as a Singular Isothermal Sphere allows us to compute the
images of a finite source analytically and then to explore how the image configurations
and magnifications depend on source position and size. Although simplified, our model
yields instructive general principles: image positions and magnifications are basically
independent of source size until the source is large enough to intersect a substructure
caustic; even sources that are much larger than the substructure Einstein radius can be
perturbed at a detectable level; and most importantly, there is a tremendous amount
to be learned from comparing image positions and magnifications at wavelengths that
correspond to different source sizes.

In a separate analysis, we carefully study four observed radio lenses to determine
which of the images are anomalous. In B0712+472, the evidence for a radio flux ratio
anomaly is marginal, but if the anomaly is real then image C is probably the culprit.
In B1422+231, the anomaly is in image A. Interestingly, B2045+265 and B1555+375
both appear to have two anomalous images. Coincidentally, in each system one of
the anomalies is in image C, and the other is in either image A or image B (both
possibilities lead to acceptable models). It remains to be seen whether ΛCDM predicts
enough substructure to explain multiple anomalies in multiple lenses. When we finally
join our modeling results and substructure theory, we obtain lower bounds on the
masses of the substructures responsible for the observed anomalies. The mass bounds
are broadly consistent with expectations for ΛCDM. Perhaps more importantly, we
outline various systematic effects in the mass bounds; poor knowledge of whether
the substructure lies within the main lens galaxy or elsewhere along the line of sight
appears to be the dominant systematic.

1 INTRODUCTION



While the ΛCDM cosmological scenario has been quite successful in describing measurements of cosmological structures
on large scales (e.g., Freemark et al. 2003; Dodelson et al. 2001; Spergel et al. 2003), it seems to overpredict the number of
galactic satellites by about an order of magnitude (Klypin et al. 1999; Moore et al. 1999). The discrepancy could be resolved
by modifying dark matter - making it warm, self-interacting, or otherwise exotic to reduce the predicted amount of small-scale
structure (e.g., Colín et al. 2000; Spergel & Steinhardt 2000). Another possibility is that star formation in low-mass haloes is
suppressed by photoionization (e.g., Bullock, Kravtsov & Weinberg 2000), which would
mean that many small haloes are present but dark. The latter hypothesis is readily tested with gravitational lensing, which is
sensitive to the distribution of both luminous and dark matter over a range of scales in galaxy haloes.

Strong gravitational lensing has long been known to probe galaxy mass distributions on kiloparsec scales (e.g., Refsdal
1964; Young et al. 1981; Kochanek 1991), and even to be sensitive to the fine graininess of stellar mass distributions
(microlensing; e.g., Chang & Refsdal 1979; Wambsganss 2001). More recently, Mao & Schneider (1998) pointed out that lensing
is also sensitive to structure on intermediate scales ($\ell \sim$ few pc, $M \sim 10^6\,M_\odot$). This effect, sometimes termed
'millilensing,' could solve a long-standing problem in lens modeling. Detailed mass models of quadruply-imaged lenses are quite
successful at matching the relative positions of the images, but often fail to reproduce the relative fluxes. Mao & Schneider
pointed out that intermediate-scale substructure could nicely explain the 'anomalous' flux ratios in radio lenses, taking the
troublesome lens B1422+231 as their example. Later, Metcalf & Madau (2001) connected this idea to the predictions of ΛCDM and
suggested that the statistical distribution of lens flux ratios could be used to test the hypothesis that galaxies contain
significant substructure. Dalal & Kochanek (2002) carried out the statistical test for seven quadruply-imaged quasars to infer
that the fraction of galactic mass in substructure is $2.0^{+5.0}_{-1.4}$ per cent (90 per cent confidence), which seemed to match
the amount of substructure expected for ΛCDM, and to rule out modified dark matter models.

Connecting observed flux ratio anomalies to inferences about dark matter requires a fairly long chain of logic, whose
strength is still being assessed. The very first link is the identification of flux ratio anomalies. Careful analysis of the lens
mapping reveals model-independent relations between certain images in 4-image lenses with 'cusp' or 'fold' configurations,
relations which can only be violated if the lens galaxy contains significant small-scale structure (Keeton et al. 2003; Gaudi et al.
2005). As valuable as that analysis is, it only reveals which lens systems contain anomalies; it does not pinpoint which individual
images are anomalous. This issue is crucial because lens theory predicts fundamental differences in how positive- and
negative-parity images are affected by small-scale structure (Schechter & Wambsganss 2002; Keeton 2003), which offers a key test of
the substructure hypothesis that may rule out competing explanations of flux ratio anomalies (see Kochanek & Dalal 2004).
Furthermore, the large number of substructures implied by ΛCDM (e.g., Cooray & Sheth 2002; Sheth & Jain 2003) suggests
the possibility that more than one image could be perturbed, but previous analyses have not determined whether that actually
occurs. One goal of our analysis is to revisit models of flux ratio anomaly lenses to see if we can figure out which of the images
are affected by substructure.

Another link in the chain of logic involves determining the length or mass scale associated with flux ratio anomalies.
According to the substructure hypothesis, the anomalies are caused by mass clumps in the range $M \sim 10^6$-$10^8\,M_\odot$,
corresponding to a length scale of a few to tens of parsecs. However, violations of the cusp and fold relations really only indicate
structure on the scale of the separation between images (see Keeton et al. 2003), which is typically no smaller than a few
tenths of an arcsecond, or hundreds of parsecs. The difference in scale means that substructure cannot yet be established as
the only viable explanation for flux ratio anomalies (e.g., Evans & Witt 2003; but see Kochanek & Dalal 2004). Moreover,
even within the substructure hypothesis, comparisons between the predicted and inferred amount of substructure are very
sensitive to scale.

The size of the source quasar brings an additional scale into the problem. Heuristically, a source 'feels' lensing structure
only on scales larger than itself. Combining conventional wisdom about structure in lens galaxies with the standard model
of quasars (e.g., Peterson 1997; Krolik 1999), it is believed that quasar optical continuum light is very sensitive to both
microlensing and millilensing; that the optical broad emission lines are certainly sensitive to millilensing and may or may
not be affected by microlensing (see Abajas et al. 2002; Lewis & Ibata 2004; Richards et al. 2004); that the radio and mid-IR
light can only be affected by millilensing; and that the optical narrow emission lines should not be affected by any small-scale
structure. Measuring the flux ratios associated with several different source sizes could therefore provide a way to determine
the substructure scale (Moustakas & Metcalf 2003). Dalal & Kochanek (2002) used these ideas in a general way, selecting radio
lenses in order to focus on millilensing (and ignore microlensing). Wisotzki et al. (2003) compared the optical continuum and
broad line flux ratios for HE 0435-1223 to infer that there must be microlensing in that system, and maybe some millilensing
as well. Metcalf et al. (2004) compared the optical narrow line flux ratios with the radio and mid-IR flux ratios for Q2237+0305
to find evidence for millilensing and place limits on the substructure mass scale.

Despite this evidence for the value of working with different source sizes, there has been no general study of source size
effects in millilensing. The second main goal of our paper is to present a semi-analytic toy model that allows us to examine a
wide range of finite source effects. Assuming that any given flux ratio anomaly is caused by a single, isolated clump that can
be modeled as an isothermal sphere is admittedly a toy model - but in the best sense of the term: a tool that not only reveals,
but also elucidates, some interesting general principles. As we completed our work, we learned that Inoue & Chiba (2004)
recently considered the same toy model and derived analytic approximations for the millilensing magnification in the limit of
a large source. Our work complements theirs by presenting exact results for a large range of source sizes, by considering some
of the effects in more detail, and also by applying the general theory to four specific observed lenses.

Thus our paper has two main goals: to better understand the flux ratio anomalies in four observed radio lenses; and to
study finite source effects in millilensing. The two parts are independent of each other, although we do combine them in the
end to place constraints on the substructures required to produce the observed flux ratio anomalies. Pedagogically, it makes
sense to begin with the study of finite source effects. In §2 we develop our toy model for millilensing and use it to study the
image configurations and magnifications for different source sizes and positions; the discussion goes into some depth, so we
offer a review of the main points in §2.6. In §3 we introduce a method for using our millilensing theory to place lower
bounds on the masses of substructures responsible for flux ratio anomalies. In §4 we turn to the analysis of real lens systems;
we first use lens models to determine which images are anomalous, and then apply our millilensing theory to derive the
substructure mass bounds. We summarize our results and conclusions in §5. Throughout the paper we assume a cosmology
with $\Omega_M = 0.3$, $\Omega_\Lambda = 0.7$, and $h = 0.7$.



2 SIS IMAGES OF A FINITE SOURCE 

This section presents our toy model for studying the effects of a finite source size. We define the model and then consider the 
image configurations and magnifications as a function of the position and size of the source. Although the focus is millilensing, 
our results have some broader implications that are discussed in §2.6.



2.1 Macromodel 

In lens modeling, it is common to begin with a smooth 'macromodel' that reproduces the number and positions of the lensed 
images. Smooth models generally fail to fit the observed flux ratios, so small clumps are introduced that modify the flux 
ratios enough to fit the data. (The clumps may modify the image positions as well, but not by much more than current 
measurement errors; see §2.5.) To understand the effects of a clump near one of the lensed images, we zoom in and consider
the lens mapping only in the vicinity of the clump. The clump is small compared with the galaxy
($R_{\rm clump}/R_{\rm gal} \sim 10^{-3}$), so on this scale the macromodel can be approximated as a constant convergence
and shear ($\kappa$ and $\gamma$, respectively). The image magnification predicted by the macromodel is

$$\mu_0 = \det\left[(1-\Gamma)^{-1}\right] = \frac{1}{(1-\kappa)^2-\gamma^2}, \qquad (1)$$

where $1-\Gamma$ is the local lens mapping (in coordinates aligned with the local shear for simplicity),

$$1-\Gamma = \begin{pmatrix} 1-\kappa-\gamma & 0 \\ 0 & 1-\kappa+\gamma \end{pmatrix}. \qquad (2)$$



We can distinguish between three types of images based on the eigenvalues of this matrix. Positive parity images have
$1-\kappa+\gamma > 1-\kappa-\gamma > 0$. Negative parity images have $1-\kappa+\gamma > 0 > 1-\kappa-\gamma$, so they are
parity reversed in one direction. Double negative parity images have $1-\kappa-\gamma < 1-\kappa+\gamma < 0$, so they are parity
reversed in both directions; however, images of this type are faint, rarely observed, and of relatively little importance for
substructure lensing (e.g., Winn, Rusin & Kochanek 2004). For the special case of an isothermal ellipsoid macromodel,
$\kappa = \gamma$ everywhere.
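
The classification above can be sketched in a few lines of Python. This is a minimal illustration rather than code from the paper: the function names `macro_magnification` and `image_parity` are hypothetical, and the shear is assumed to satisfy $\gamma \ge 0$ in the shear-aligned frame of eq. (2).

```python
def macro_magnification(kappa, gamma):
    """Signed macromodel magnification mu_0 of eq. (1)."""
    return 1.0 / ((1.0 - kappa) ** 2 - gamma ** 2)


def image_parity(kappa, gamma):
    """Classify an image by the signs of the eigenvalues in eq. (2).

    Assumes gamma >= 0, so that 1 - kappa + gamma is the larger eigenvalue.
    Images exactly on a critical curve (an eigenvalue equal to zero) are not
    treated specially in this sketch.
    """
    lam_minus = 1.0 - kappa - gamma
    lam_plus = 1.0 - kappa + gamma
    if lam_minus > 0.0:
        return "positive"
    if lam_plus > 0.0:
        return "negative"
    return "double negative"
```

With $\kappa = \gamma = 0.3$ the image has positive parity and $\mu_0 = 2.5$, while $\kappa = \gamma = 0.7$ gives negative parity and $\mu_0 = -2.5$; these are the two macromodels used in the examples of §2.3 and §2.4.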

Many studies of millilensing have assumed that the clump lies in the halo of the main lens galaxy, but Keeton (2003)
and Metcalf (2004) have pointed out that a clump elsewhere along the line of sight could still have a significant effect. While
one can invoke statistical arguments about whether a clump is more likely to lie in the galaxy or along the line of sight,
strictly speaking the clump redshift is unknown and that may lead to a systematic uncertainty in
a millilensing analysis. Fortunately, this effect is easily accommodated in our formalism. Keeton (2003) showed that if the
clump does lie at a different redshift than the lens galaxy, the macromodel can still be treated as a simple convergence and
shear, but with effective values

$$\kappa_{\rm eff} = \frac{(1-\beta)\left[\kappa-\beta(\kappa^2-\gamma^2)\right]}{(1-\beta\kappa)^2-(\beta\gamma)^2}, \qquad
\gamma_{\rm eff} = \frac{(1-\beta)\,\gamma}{(1-\beta\kappa)^2-(\beta\gamma)^2}, \qquad (3)$$

where $\beta = (D_{cl} D_{os})/(D_{ol} D_{cs})$ for a foreground clump ($z_c < z_l$), while
$\beta = (D_{lc} D_{os})/(D_{oc} D_{ls})$ for a background clump ($z_l < z_c < z_s$). In what follows we simply use $\kappa$
and $\gamma$ to denote the macromodel, bearing in mind that we should use the
effective values if we want to consider a line-of-sight clump.
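
As a hedged illustration of eq. (3), the effective values can be evaluated with a small helper. The function name `effective_macromodel` is hypothetical, and $\beta$ is taken as an input rather than computed from a cosmology.

```python
def effective_macromodel(kappa, gamma, beta):
    """Effective convergence and shear of eq. (3) for a line-of-sight clump.

    beta is the distance ratio defined in the text; beta = 0 corresponds to a
    clump in the plane of the lens galaxy, where the macromodel is unchanged.
    """
    denom = (1.0 - beta * kappa) ** 2 - (beta * gamma) ** 2
    kappa_eff = (1.0 - beta) * (kappa - beta * (kappa ** 2 - gamma ** 2)) / denom
    gamma_eff = (1.0 - beta) * gamma / denom
    return kappa_eff, gamma_eff
```

Two limits give a quick sanity check: $\beta = 0$ returns $(\kappa, \gamma)$ unchanged, and $\beta = 1$ returns zero convergence and shear.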

Another possible systematic effect arises from the 'mass sheet degeneracy' in the macromodel. Adding a uniform mass
sheet (and rescaling the galaxy mass appropriately) leaves the image positions and flux ratios unchanged (Gorenstein et al.
1988; Saha 2000). Turning the problem around, lens models cannot detect the presence of a mass sheet, which can bias the
conclusions drawn from the models (Keeton & Zabludoff 2004). We should therefore consider how a mass sheet might affect
a millilensing analysis. For our purposes, adding a mass sheet of density $\kappa_{\rm sheet}$ is equivalent to a simple
rescaling of the macromodel:

$$\kappa' = (1-\kappa_{\rm sheet})\,\kappa + \kappa_{\rm sheet}, \qquad
\gamma' = (1-\kappa_{\rm sheet})\,\gamma. \qquad (4)$$

This rescaling is the same no matter whether the clump lies in the lens galaxy or along the line of sight.

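
The rescaling of eq. (4) is simple enough to check numerically. The sketch below uses hypothetical helper names and illustrates how the macromodel magnification of eq. (1) responds to a mass sheet.

```python
def add_mass_sheet(kappa, gamma, kappa_sheet):
    """Rescale the macromodel convergence and shear as in eq. (4)."""
    scale = 1.0 - kappa_sheet
    return scale * kappa + kappa_sheet, scale * gamma


def macro_magnification(kappa, gamma):
    """Signed macromodel magnification mu_0 of eq. (1)."""
    return 1.0 / ((1.0 - kappa) ** 2 - gamma ** 2)


# Example: the ratio of rescaled to original magnification depends only on
# kappa_sheet, which is why flux ratios between images are left unchanged.
kappa_p, gamma_p = add_mass_sheet(0.3, 0.3, 0.2)
ratio = macro_magnification(kappa_p, gamma_p) / macro_magnification(0.3, 0.3)
```

Substituting eq. (4) into eq. (1) shows that the ratio equals $(1-\kappa_{\rm sheet})^{-2}$ for any $\kappa$ and $\gamma$.
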
2.2 Micromodel 

We model the mass clump as a singular isothermal sphere (SIS). One of the advantages of the SIS is that its
$\rho \propto r^{-2}$ density profile yields a simple form for the deflection angle,

$$\boldsymbol{\alpha}(\mathbf{x}) = b\,\frac{\mathbf{x}}{|\mathbf{x}|}, \qquad (5)$$

where the Einstein radius $b$ is

$$b = 4\pi \left(\frac{\sigma}{c}\right)^2 \frac{D_{ls}}{D_{os}}. \qquad (6)$$

Here, $\sigma$ is the velocity dispersion of the SIS, while $D_{os}$ and $D_{ls}$ are angular diameter distances from the
observer or lens to the source. Although N-body simulations predict a different form for the density profile of dark matter
structures (e.g., Navarro, Frenk & White 1996), the SIS has been used for modeling substructures in previous studies
(Metcalf & Madau 2001; Dalal & Kochanek 2002), and its simplicity makes it an attractive choice for a toy model whose purpose
is to yield general insights. The mass of the SIS increases linearly with radius, and the projected 2-D mass within the
Einstein radius is

$$M = \pi\,(D_{ol}\,b)^2\,\Sigma_{cr}, \qquad (7)$$

where $\Sigma_{cr} = c^2 D_{os}/(4\pi G D_{ol} D_{ls})$ is the critical surface density for lensing and $D_{ol}$ is the
angular diameter distance from the observer to the lens.

The clump + macromodel system is governed by the lens equation

$$u = (1-\kappa-\gamma)\,r\cos\theta - b\cos\theta, \qquad
v = (1-\kappa+\gamma)\,r\sin\theta - b\sin\theta, \qquad (8)$$

where $\mathbf{u} = (u, v)$ are coordinates in the source plane and $\mathbf{x} = (r\cos\theta, r\sin\theta)$ are coordinates
in the image plane (centred on the clump). In substructure lensing, solutions of this lens equation represent 'micro-images'
that are not separately resolved but combine to form the observed macro-image. For a point source, the individual micro-image
magnifications are given by

$$\mu^{-1} = \left|\frac{\partial\mathbf{u}}{\partial\mathbf{x}}\right|
= \mu_0^{-1} - \frac{b}{r}\,(1-\kappa-\gamma\cos 2\theta). \qquad (9)$$

The tangential critical curve for the lens system can be found by taking $\mu \to \infty$, which yields

$$r_{\rm crit} = \frac{b\,(1-\kappa-\gamma\cos 2\theta)}{(1-\kappa)^2-\gamma^2}. \qquad (10)$$

Plugging this into the lens equation then gives a parametric equation for the tangential caustic. The radial pseudo-caustic is
the curve in the source plane that maps to the origin in the image plane; from eq. (8), it can be written parametrically as

$$u_p = -b\cos\theta, \qquad (11)$$
$$v_p = -b\sin\theta. \qquad (12)$$

It can be seen from equations (8) and (9) that the positions and magnifications of the images of a point source depend on
the perturber strength $b$, the position of the source relative to the perturber, and $\kappa$ and $\gamma$ from the macromodel.
There is no general analytic solution to the lens equation even for a simple SIS perturber. Nevertheless, it is possible to find
an analytic solution for a source of finite size at an arbitrary position. Finch et al. (2002) showed how to compute the area
enclosed by the caustics of an SIS lens in an external shear field, and in the following sections we extend their method to find
the positions, shapes, and magnifications of the images of a finite source lensed by an SIS in a convergence and shear field.
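
The critical curve and caustics described in this section can be traced numerically. The following is a minimal sketch assuming NumPy; the function names are hypothetical, and only the expressions for the lens equation (8), the tangential critical curve, and the radial pseudo-caustic given above are used.

```python
import numpy as np


def lens_map(r, theta, kappa, gamma, b=1.0):
    """Map image-plane polar coordinates (r, theta) to the source plane, eq. (8)."""
    u = ((1.0 - kappa - gamma) * r - b) * np.cos(theta)
    v = ((1.0 - kappa + gamma) * r - b) * np.sin(theta)
    return u, v


def tangential_critical_radius(theta, kappa, gamma, b=1.0):
    """Radius of the tangential critical curve as a function of theta."""
    return (b * (1.0 - kappa - gamma * np.cos(2.0 * theta))
            / ((1.0 - kappa) ** 2 - gamma ** 2))


def tangential_caustic(theta, kappa, gamma, b=1.0):
    """Tangential caustic: the critical curve pushed through the lens equation.

    Angles where the critical radius is not positive are returned as NaN,
    since they do not correspond to points in the image plane.
    """
    r_crit = tangential_critical_radius(theta, kappa, gamma, b)
    r_crit = np.where(r_crit > 0.0, r_crit, np.nan)
    return lens_map(r_crit, theta, kappa, gamma, b)


def radial_pseudo_caustic(theta, b=1.0):
    """Source-plane curve that maps to the image-plane origin: a circle of radius b."""
    return -b * np.cos(theta), -b * np.sin(theta)
```

For the positive parity example with $\kappa = \gamma = 0.3$ and $b = 1$, the tangential caustic meets the axes at $(\pm 0.6, 0)$ and $(0, \pm 1.5)$, and the pseudo-caustic is the unit circle.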

2.3 Analytic solution for the images of a finite source 

First, we consider a circular source and parametrize its boundary:

$$u = u_0 + a\cos\lambda, \qquad (13)$$
$$v = v_0 + a\sin\lambda, \qquad (14)$$

where $u_0$ and $v_0$ are the coordinates of the centre of the source, $a$ is the source size, and $\lambda$ varies from 0 to
$2\pi$. (Finch et al. considered the special case $u_0 = v_0 = 0$ and $a = b$.) Plugging the source boundary into the lens
equation (8) yields

$$a\cos\lambda = (1-\kappa-\gamma)\,r\cos\theta - b\cos\theta - u_0, \qquad (15)$$
$$a\sin\lambda = (1-\kappa+\gamma)\,r\sin\theta - b\sin\theta - v_0. \qquad (16)$$
We can eliminate $\lambda$ by squaring and adding these two equations to obtain

$$0 = r^2\Gamma_-^2\cos^2\theta + r^2\Gamma_+^2\sin^2\theta - 2u_0\Gamma_- r\cos\theta - 2v_0\Gamma_+ r\sin\theta
- 2\Gamma_- r b\cos^2\theta - 2\Gamma_+ r b\sin^2\theta$$
$$\qquad + (u_0^2+v_0^2) + b^2 - a^2 + 2u_0 b\cos\theta + 2v_0 b\sin\theta, \qquad (17)$$

where $\Gamma_\pm = (1-\kappa\pm\gamma)$. This is a quadratic equation for $r(\theta)$ whose solution yields the boundary of
the image(s):

$$r_\pm(\theta) = \frac{A \pm \sqrt{B}}{C}, \qquad (18)$$

where

$$A = -2u_0(\gamma+\kappa-1)\cos\theta - 2b(\gamma\cos 2\theta+\kappa-1) + 2v_0(\gamma-\kappa+1)\sin\theta,$$
$$B = -4\left[\gamma^2+(\kappa-1)^2+2\gamma(\kappa-1)\cos 2\theta\right]
\left[b^2-a^2+u_0^2+v_0^2+2bu_0\cos\theta+2bv_0\sin\theta\right]$$
$$\qquad + 4\left[u_0(\kappa+\gamma-1)\cos\theta + b(\kappa+\gamma\cos 2\theta-1) + v_0(\kappa-\gamma-1)\sin\theta\right]^2,$$
$$C = 2\left[\gamma^2+(\kappa-1)^2+2\gamma(\kappa-1)\cos 2\theta\right].$$

Though complicated, this is a completely analytic mapping of the boundary of the source to the boundary of the image(s). 

While eq. (18) completely describes the image boundary, it is important to note that only solutions with $r_\pm(\theta)$ real
and positive are physical. For some parameter combinations, $B$ can be negative, which implies that $\sqrt{B}$, and hence
$r_\pm(\theta)$, is complex. In particular, for given values of $(\gamma, \kappa, a, b, u_0, v_0)$, $B$ may or may not be
negative for a particular $\theta$. The range of $\theta$ for which $B \ge 0$ defines the azimuthal extent of the image(s).
In addition, there are also parameter combinations for which
$r_\pm(\theta) < 0$. Such solutions are not physical and form the boundaries of an 'artefact' image.
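
A direct transcription of eq. (18) into code makes the two physical conditions explicit. This is a sketch assuming NumPy; the function name `image_boundary` and the choice to mark unphysical branches with NaN are illustrative conventions, not part of the original analysis.

```python
import numpy as np


def image_boundary(theta, kappa, gamma, a, b, u0, v0):
    """Evaluate r_plus and r_minus of eq. (18) on an array of angles theta.

    Entries are NaN where the solution is complex (B < 0) or where the radius
    is not positive (the 'artefact' branch), so only physical points remain.
    """
    c, s, c2 = np.cos(theta), np.sin(theta), np.cos(2.0 * theta)
    A = (-2.0 * u0 * (gamma + kappa - 1.0) * c
         - 2.0 * b * (gamma * c2 + kappa - 1.0)
         + 2.0 * v0 * (gamma - kappa + 1.0) * s)
    C = 2.0 * (gamma ** 2 + (kappa - 1.0) ** 2 + 2.0 * gamma * (kappa - 1.0) * c2)
    const = b ** 2 - a ** 2 + u0 ** 2 + v0 ** 2 + 2.0 * b * u0 * c + 2.0 * b * v0 * s
    # The squared bracket in the expression for B equals (A / 2)^2, so
    # B = A^2 - 2 * C * const is the same quantity written more compactly.
    B = A ** 2 - 2.0 * C * const
    root = np.sqrt(np.where(B >= 0.0, B, np.nan))
    r_plus = (A + root) / C
    r_minus = (A - root) / C
    r_plus = np.where(r_plus > 0.0, r_plus, np.nan)
    r_minus = np.where(r_minus > 0.0, r_minus, np.nan)
    return r_plus, r_minus
```

A useful check is to push the returned boundary points back through the lens equation (8): every physical point should land on the source circle of radius $a$ centred on $(u_0, v_0)$.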

The $r_\pm(\theta)$ solutions shown in Fig. 1 exhibit both of these features. For this example, $\kappa = \gamma = 0.3$ so the
unperturbed image has positive parity and magnification $\mu_0 = 2.5$. Panels (a) and (b) show $r_\pm(\theta)$, which are real
for only certain values of $\theta$. Physically this implies that the images have a finite azimuthal extent, as can be seen in
panel (c). Panel (b) also shows that there is a range of $\theta$ where $r_-(\theta) < 0$. This corresponds to the unphysical
artefact image shown by the dotted line in panel (c).

Fig. 2 shows how image configurations change as the source size $a$ is increased. The left column shows the source and
the caustics while the right column shows the images and the critical curves; without loss of generality, we work in units with
$b = 1$. For the top row, the source with $a = 0.01$ has been placed near a fold caustic but has not intersected it. The source
lies completely within a two-image region and the $r(\theta)$ solution does in fact give two images. In the second row, the source
has doubled in size and now crosses the fold caustic. In this configuration, part of the source lies in the two-image region and
another part lies in the four-image region. The $r(\theta)$ solution shows that the initial two images have grown slightly in size
and a third image (which is actually a merged image pair) has appeared in the upper left. As the source size is increased further,
it crosses more and more of the caustics, yielding complex image solutions which consist of merging and growing images. By
$a = 1.2$, the source covers most of the caustic and the resulting image is clearly becoming the ellipse that one would expect
for a simple convergence and shear field. Nevertheless, even at this large source size there are still significant deviations from
the unperturbed image.

Fig. 3 is similar to Fig. 2 except that the unperturbed image has negative parity with $\kappa = \gamma = 0.7$
($\mu_0 = -2.5$), and we use a different source position. Between $a = 0.04$ and $a = 0.15$, the source crosses the caustic
separating the two- and three-image regions, and the image configurations show the appearance of a small third image near the
origin. By $a = 0.2$, the source has intersected the caustic again, which corresponds to the merging of two of the images.
Increasing the source size further results in the growth and merger of the images, and by $a = 1.2$ we are again beginning to
see the ellipse that would be expected from only the convergence and shear field.



2.4 Magnification of a finite source 

Having found the image configurations for finite sources, we now seek the magnifications. Since gravitational lensing conserves 
surface brightness, the change of flux is due solely to the change in size of the source when it is lensed. If the source has a 
uniform surface brightness, then the magnification is the ratio of the area of the image(s) to the area of the source. 

Our parametric solution for the image boundaries allows us to compute the image area, if we take care to understand the
different solution regimes. Where $r_+(\theta)$ and $r_-(\theta)$ are real and positive, they form the outer and inner boundaries
(respectively) of the images (see Fig. 4). The image area is then

$$\frac{1}{2}\int_I \left[r_+^2(\theta) - r_-^2(\theta)\right] d\theta, \qquad (19)$$

where $I$ is the range of $\theta$ over which the solution is defined (i.e., where $B \ge 0$). If only $r_+(\theta)$ is real and
positive, it forms the complete boundary of the image and the image area is $\frac{1}{2}\int_I r_+^2(\theta)\,d\theta$. Finally,
where $r_+(\theta)$ and $r_-(\theta)$ are both negative, there is no contribution to the image area. (Note that $r_- < r_+$ for
all parameter values and all $\theta$, so there is never an area contribution due to $r_-(\theta)$ alone.)

Figure 1. Analytic image boundary solutions for $\kappa = \gamma = 0.3$, $(u_0, v_0) = (1.2, 0.8)$, and $(a, b) = (1.2, 1.0)$.
(a) $r_+(\theta)$ versus $\theta$. The region of $\theta$ where $r_+(\theta)$ is defined gives the azimuthal extent of the images.
(b) $r_-(\theta)$ versus $\theta$. The dotted line indicates where $r_-(\theta)$ is real but negative, which corresponds to an
unphysical 'artefact' image. (c) Image boundaries in the $(x, y)$ plane. Solid lines show the physical solutions, while the
dotted line shows the artefact image with $r_-(\theta) < 0$.

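Thus, the total magnification (ratio of image area to source area), normalized by the macromodel magnification, can be written as

$$M = \frac{1}{2\pi a^2 |\mu_0|}\int_0^{2\pi} d\theta \times
\begin{cases}
r_+^2(\theta) - r_-^2(\theta) & \text{if } B > 0 \text{ and } A > \sqrt{B} \\
r_+^2(\theta) & \text{if } B > 0 \text{ and } A < \sqrt{B} \\
0 & \text{if } B < 0 \text{ or } A + \sqrt{B} < 0
\end{cases} \qquad (20)$$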



Since we are primarily interested in how much the flux changes due to the presence of the SIS perturber, we have normalized
eq. (20) with respect to the magnification $\mu_0$ produced by the convergence and shear alone. We refer to this as the
'normalized magnification.' Unfortunately, the integral in eq. (20) cannot be evaluated analytically. Still, it requires only a
one-dimensional numerical integral, which means that the analytic solution of the lens equation yields a much faster calculation
than a conventional two-dimensional numerical integral. Inoue & Chiba (2004) give analytic approximations for $M$ in the limit
of a large source (also see Appendix A), but we are interested in the exact result for a wide range of source sizes.
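
The one-dimensional integral can be sketched with a uniform grid in $\theta$. This is an illustrative implementation assuming NumPy, a uniform-brightness circular source, and normalization by $\pi a^2 |\mu_0|$; the function name and the simple Riemann-sum quadrature are choices made here for brevity, not the method used to produce the figures.

```python
import numpy as np


def normalized_magnification(kappa, gamma, a, b, u0, v0, n_theta=200000):
    """Normalized magnification M for a uniform circular source of radius a.

    The image area is accumulated from r_plus^2 - r_minus^2 where both roots
    of eq. (18) are real and positive, and from r_plus^2 alone where only
    r_plus is positive; it is then divided by pi * a^2 * |mu_0|.
    """
    theta = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    c, s, c2 = np.cos(theta), np.sin(theta), np.cos(2.0 * theta)
    A = (-2.0 * u0 * (gamma + kappa - 1.0) * c
         - 2.0 * b * (gamma * c2 + kappa - 1.0)
         + 2.0 * v0 * (gamma - kappa + 1.0) * s)
    C = 2.0 * (gamma ** 2 + (kappa - 1.0) ** 2 + 2.0 * gamma * (kappa - 1.0) * c2)
    const = b ** 2 - a ** 2 + u0 ** 2 + v0 ** 2 + 2.0 * b * u0 * c + 2.0 * b * v0 * s
    B = A ** 2 - 2.0 * C * const          # discriminant of the quadratic, eq. (17)
    real = B >= 0.0
    root = np.sqrt(np.clip(B, 0.0, None))  # clipped values are masked by `real`
    r_plus = (A + root) / C
    r_minus = (A - root) / C
    outer = np.where(real & (r_plus > 0.0), r_plus ** 2, 0.0)
    inner = np.where(real & (r_minus > 0.0), r_minus ** 2, 0.0)
    area = 0.5 * np.sum(outer - inner) * (2.0 * np.pi / n_theta)
    mu0 = 1.0 / ((1.0 - kappa) ** 2 - gamma ** 2)
    return area / (np.pi * a ** 2 * abs(mu0))
```

Two consistency checks are available without any reference data: setting $b = 0$ removes the clump and should give $M = 1$, and evaluating the function for $\kappa = \gamma = 0.3$, $(u_0, v_0) = (0.2, 0.6)$ and $a/b = 50$ can be compared with the perturbation of a few per cent quoted in the text for that case.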

We can now understand the effect that the mass sheet degeneracy in the macromodel has on the substructure analysis.
Adding a mass sheet rescales the macromodel as shown in eq. (4). From eqs. (1) and (18), we then see that $\mu_0$ and
$r_\pm^2$ are both rescaled by $(1-\kappa_{\rm sheet})^{-2}$. As a result, the normalized magnification $M$ is unchanged by the
addition of the mass sheet. We conjecture that this result is special to the SIS clump model, and it would be interesting to
consider other clump models. However, for our purposes the remarkable implication is that our substructure analysis is
completely unaffected by the mass sheet degeneracy.

Fig. 5 shows the normalized magnification as a function of source size for the source from Fig. 2. The curve has several
notable features that can be understood in terms of the image configurations in Fig. 2. First, for $a < 0.01$ the source does
not come into contact with the caustics and the magnification is basically independent of source size. The sharp increase in
magnification between $a = 0.01$ and 0.02 corresponds to the appearance of a third image as the source begins to cross the fold
caustic. The magnification then comes to a large peak, followed by a smaller peak near $a \approx 0.8$. Comparing to the fourth
row of Fig. 2, we see that this secondary peak occurs when the source begins to come into contact with the caustic cusps. Finally,
as $a$ becomes large, the magnification approaches unity; that is, for large sources the effect of the SIS perturber becomes
negligible, as expected. Nevertheless, it should be noted that even at $a/b = 50$, the image flux is still perturbed by 3.7 per
cent.

Figure 2. Image configurations for different source sizes. The left column shows the source boundaries and caustics, while the
right column shows the image boundaries and critical curves. In these examples, $\kappa = \gamma = 0.3$,
$(u_0, v_0) = (0.2, 0.6)$, $b = 1$, and $a$ is increased from 0.01 to 1.2. At large source size ($a/b > 1$) there are still
significant deviations from the ellipse image expected for a pure convergence and shear field.

Figure 3. Similar to Fig. 2 but for negative parity. In these examples, $\kappa = \gamma = 0.7$, $(u_0, v_0) = (0.9, 0.15)$,
$b = 1$, and $a$ is increased from 0.04 to 1.2.

Figure 4. Image configuration for $\kappa = \gamma = 0.3$, $(u_0, v_0) = (0.2, 0.6)$, $(a, b) = (0.1, 1.0)$. The solid line shows
$r_+(\theta)$, which forms the outer boundary of the images. The dotted line shows $r_-(\theta)$, which forms the inner boundary.

Fig. 6 shows the normalized magnification as a function of $a$ for the source from Fig. 3. The parameters here are the
same as the previous figure except that this case has negative parity ($\kappa = \gamma = 0.7$). For an SIS clump in front of a
negative parity image, most source positions yield demagnification relative to the background convergence and shear field (Keeton
2003). Indeed, for this position the normalized magnification is less than 1 for most source sizes. However, there is a region of
magnification (relative to the convergence and shear field) for $0.1 < a < 1.0$. We can again match some of the features of this
plot to the image configurations from Fig. 3. There is very little change in the normalized magnification until $a \approx 0.1$,
which corresponds to the source coming into contact with the caustic. There is a peak at $a \approx 0.2$ as the source starts to
come into contact with the caustic cusp, followed by a shallow dip at $a \approx 0.3$ as the source begins to occupy more of the
demagnification region of the source plane. As the source size is increased further, there is then another maximum followed by a
minimum at $a \sim 2$. The normalized magnification for this negative parity case also approaches unity for large $a$, with the
flux at $a/b = 50$ differing from unity by 1 per cent. Comparing the positive and negative parity cases gives the interesting
result that, at large $a$, an SIS perturber has less effect on the magnification of a negative parity image than on an equivalent
positive parity image.

Since the magnification is largely determined by encounters with caustics, we now study how the magnification versus
source size curve changes as the source position is varied. Fig. 7 shows the normalized magnification curves for different source
positions, for a positive parity case ($\kappa = \gamma = 0.3$). The upper left panel shows the caustics and source positions for
the plot's other panels, and the upper right panel shows the normalized magnification curve for the source located at the origin.

Figure 5. Normalized magnification as a function of source size for $\kappa = \gamma = 0.3$, $(u_0, v_0) = (0.2, 0.6)$, $b = 1$.
The sharp increase at $a \approx 0.015$ corresponds to a transition from two images to three (see text).

Figure 6. Similar to Fig. 5 but for $\kappa = \gamma = 0.7$, $(u_0, v_0) = (0.9, 0.15)$, $b = 1$. The features of the curve are
due to the source coming into contact with the caustics, as shown in Fig. 3.

Figure 7. Normalized magnification as a function of source size for a positive parity case ($\kappa = \gamma = 0.3$), with
$b = 1$. The top left panel shows the caustics and source positions. The top right panel shows the magnification curve for a
source at the origin. The left column represents moving the source along the $u$-axis, $u_0 = 0.4, 0.8, 1.2$ (top to bottom); the
middle column represents moving along the $v$-axis, $v_0 = 0.5, 1.1, 1.8$; and the right column represents moving along the line
$v_0 = u_0$, with $u_0 = 0.2, 0.5, 1.0$.

Most of these curves have at least one, and in some cases two, peaks where the normalized magnification increases sharply.
These peaks are, in general, associated with the source boundary crossing a caustic. It should also be noted that, while some
peak heights are relatively low with a normalized magnification of around 2, the normalized magnification can become as high
as about 16 (second row middle column). This case is particularly illuminating as the main peak turns out to coincide with
the source crossing the upper cusp caustic while the secondary peak around $a \sim 1.0$ corresponds to the source coming into
contact with the left and right cusps. The next panel (row three column two) is also notable for the plateau at low values of
$a$, and again there is a small secondary peak around $a \sim 1$-2 that corresponds to the source coming into contact with the
left and right cusps. Finally, two general features of the plots are striking. First, all of the curves remain fairly constant
at small $a$ where the source does not intersect the caustics of the SIS, implying that the source does not 'feel' the structure
of the perturber before it comes into contact with these caustics. Second, although all of the plots tend towards unity as
expected for a large source, they do seem to deviate from unity fairly uniformly at $a > 10$; in particular, all of the
magnifications are $M \approx 1.2$ at $a = 10$. In Appendix A we formalize this result by showing that the normalized
magnification is independent of source position, to first order in $1/a$. Since it is possible to measure flux ratios with
percent-level precision (after correcting for time delays; see Fassnacht et al. 2002), an important implication is that even
large sources relatively far from the mass clump can be perturbed at a detectable level.

Fig. 8 shows the normalized magnification curves at various source positions for a negative parity case ($\kappa = \gamma = 0.7$). 
The behavior for the negative parity case is a bit more complex in that both magnification and demagnification (relative to 
the convergence and shear field) are seen. For example, at a source position of $(u_0, v_0) = (1.1, 0.0)$ (second row first column), the 
source is magnified at low $a$, rises to a peak at $a \sim 0.2$, falls to a demagnified valley at $a \sim 2$, and then rises toward unity 
for larger $a$. As in the positive parity case, we see that for small $a$ there is relatively little structure in the curves. Also, the 
magnifications are again fairly uniform for $a > 10$ (see Appendix A), with $\mathcal{M} \approx 0.95$ at $a \approx 10$. 

Figure 8. Similar to Fig. 7 but for a negative parity case ($\kappa = \gamma = 0.7$). 



2.5 Center of flux position 

Although we have determined the image boundary and magnification of a finite source, we have not yet quantified the image 
position. If the separate component images are not resolved, what matters is the 'centre of flux' of the image configuration, 
which can be computed as 

$$X = \frac{\int x \, f(x) \, dx}{\int f(x) \, dx} , \qquad (21)$$

where $f(x)$ is 1 inside the image and 0 outside. (This is the optical analog of the centre of mass.) The difference between this 
'centre of flux' and the original image position in the absence of the clump, $X_0$, is the astrometric shift due to the perturber, 

$$\delta X = X - X_0 . \qquad (22)$$

Our solution for $r_\pm(\theta)$ yields a straightforward calculation for the astrometric perturbation, provided that we account for the 
different solution regimes as in eq. (20). 
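
As a rough numerical illustration of the centre-of-flux definition, the sketch below evaluates the two integrals on a pixel grid, with $f = 1$ inside the image and 0 outside. It does not use the analytic $r_\pm(\theta)$ boundary from the text; the function name `centre_of_flux`, the grid, and the two circular "images" are hypothetical choices made only for this example.

```python
import numpy as np

def centre_of_flux(mask, x_coords, y_coords):
    """Centroid of a uniform-brightness image: f = 1 inside the mask, 0 outside."""
    f = np.asarray(mask, dtype=float)
    total = f.sum()
    if total == 0:
        raise ValueError("image mask is empty")
    # mask is indexed [iy, ix], matching meshgrid with indexing="xy"
    xx, yy = np.meshgrid(x_coords, y_coords, indexing="xy")
    return np.array([(xx * f).sum() / total, (yy * f).sum() / total])

# Hypothetical configuration: two unresolved component images of unequal area.
x = np.linspace(-2.0, 2.0, 401)
y = np.linspace(-2.0, 2.0, 401)
xx, yy = np.meshgrid(x, y, indexing="xy")
mask = ((xx - 1.0) ** 2 + yy ** 2 < 0.2 ** 2) | ((xx + 1.0) ** 2 + yy ** 2 < 0.1 ** 2)

X = centre_of_flux(mask, x, y)
X0 = np.array([0.0, 0.0])   # assumed unperturbed image position, for illustration only
delta_X = X - X0            # astrometric shift
print(X, delta_X)
```

Because the brightness is uniform, the centroid is simply area-weighted: the larger component pulls the centre of flux toward itself, which is the same mechanism by which a newly emerging bright image shifts the centroid.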

Fig. 9 shows the astrometric perturbation as a function of source size for the source from Fig. 3. As with the magnification 
calculation, we see that for a source size below about $a = 0.015$ the astrometric perturbation remains fairly constant. 
Comparison with the image configuration from Fig. 2 shows that the centre of flux position lies on the line joining the two 
images that are seen at $a = 0.01$. Between $a = 0.01$ and $a = 0.02$ there is a sudden change as the source intersects the caustic. 
The emergence of a bright third image rapidly pulls the centre of flux away from the line joining the initial two images. 

Figure 9. Astrometric perturbation as a function of source size for $b = 1$, $(u_0, v_0) = (0.2, 0.6)$, and $\kappa = \gamma = 0.3$. The large change 
between $a = 0.01$ and $0.02$ corresponds to a change from two to three images (see Fig. 5). For a large source, the perturber has little effect on 
the centre of flux position, as expected. 




For a large source size, we would expect that the perturbing SIS would have little effect on the centre of flux of the image 
configuration and indeed as a gets large, the astrometric perturbation tends towards zero. 

Fig. 10 shows the analogous results for a negative parity case with $\kappa = \gamma = 0.7$. There is not much change in the 
astrometric perturbation when $a$ is less than about 0.1, but there is then a dramatic dip in both curves between $a = 0.01$ 
and 0.02. Returning to Fig. 3, we see that this large change in the centre of flux position occurs when the source begins to 
intersect the caustic. 

Like stellar microlensing (e.g., Treyer & Wambsganss 2004), substructure lensing can produce astrometric shifts of order 
several Einstein radii. The difference is in the scale: the Einstein radius for stars is of order micro-arcseconds, while for 
substructure it is milli-arcseconds. Thus, astrometric perturbations due to substructure should be detectable with radio 
interferometry, and perhaps even with space-based optical or infrared imaging. While small position shifts might be degenerate 
with small changes in the macromodel, the dependence of the shift on source size (and hence wavelength) would provide a 
clear signature of substructure lensing. Astrometric shifts are related to shape perturbations (see Metcalf 2002), but do not 
require resolved image shapes. A full analysis of prospects for observing astrometric perturbations and using them to constrain 
substructure is beyond the scope of this paper, but warrants further study. 



2.6 Comments 

To review, we have studied how the magnification depends on the size of the source and its position relative to the caustics. 
The specific details depend on our assumption of an SIS clump, which is a toy model, but the basic principles should be more 
general. One point is that the image properties are basically independent of the source size until the source is large enough 
to encounter the caustics; that threshold of course depends on the source position. In the other extreme, sources more than 
an order of magnitude larger than the clump Einstein radius can still be perturbed at the percent level, and that precision 

Figure 11. Normalized magnification as a function of $b/a$ for a positive parity image (a) and a negative parity image (b). The crosses 
represent the maximum (maximized over source position) or minimum magnification for a given $b/a$. The solid lines show $\mathcal{M}$ versus $b/a$ 
at various fixed source positions, from Figs. 7 and 8. 



can be obtained with careful observations (e.g., Fassnacht et al. 2002). In other words, the conventional wisdom that a source 
does not 'feel' lensing structure on scales smaller than itself is not really accurate beyond order of magnitude estimates. 

In between these extremes, there is significant structure in the magnification versus source size curves. Therefore, in 
principle, comparing the flux of an image at different wavelengths corresponding to different source sizes could reveal a 
wealth of information about the size and location of substructure. Prospects for doing that are good: recent observations have 
demonstrated the ability to measure flux ratios not only for radio and optical continua, but also for optical emission lines and 
mid-IR emission (Agol et al. 2000; Wisotzki et al. 2003; Metcalf et al. 2004). We would advocate a concerted effort to obtain 
and analyse such panchromatic observations of lenses with flux ratio anomalies (see Metcalf et al. 2004 for an example). 
Even more exciting is the possibility of measuring astrometric shifts along with flux perturbations; further study is needed to 
determine the feasibility and value of such measurements. 



3 PLACING LIMITS ON SUBSTRUCTURE SIZE 

While a few lens systems have been observed at many wavelengths, the ones that are most interesting for millilensing are 
still limited to radio continuum observations (plus perhaps broad-band optical data). Nevertheless, it is still possible to place 
important lower bounds on the substructure responsible for observed flux ratio anomalies. In this section we customize our 
general analysis of substructure lensing to this application. 



3.1 Maximally affected images 

We have seen that changing the position of the source relative to the perturber (or vice versa) has a dramatic effect on the 
magnification versus source size curves. However, in general we do not know the relative position, so it is useful to determine 
the bounds on the magnification that can be produced by a given perturber for a given source size. In practice, this amounts 
to setting the ratio $b/a$ of perturber and source sizes (as well as the background field $\kappa$ and $\gamma$), and then maximizing or 
minimizing the normalized magnification over $u_0$ and $v_0$. Fig. 11a shows the bounds as a function of $b/a$ for a positive parity 
image with $\kappa = \gamma = 0.3$. For comparison, the solid lines show curves for fixed source positions (from Fig. 7). (The lower bound 
is trivial, $\mathcal{M} = 1$, since an SIS perturber in front of a positive parity image never produces demagnification.) At small $b/a$, 
all of the curves have roughly the same behavior, again illustrating that at large source size the change in magnification is 
independent of position. At large $b/a$, $\mathcal{M}_{\max}$ grows to infinity since, in the limit of an infinitesimal source, placing the source 
on the caustic yields infinite magnification. 

The analogous results for a negative parity case are shown in Fig. 11b. Here we have both $\mathcal{M}_{\max}$ and $\mathcal{M}_{\min}$ curves since 

Figure 12. Maximum and minimum normalized magnification versus $b/a$, for different values of $\kappa$ and $\gamma$. 



an SIS in front of a negative parity image can produce both magnification and demagnification.$^1$ The curves of $\mathcal{M}$ versus 
$b/a$ at various source positions from Fig. 8 are again shown for comparison. An important feature is that the $\mathcal{M}_{\min}$ curve 
approaches a constant value at large $b/a$. In the limit of an infinitely small source, a negative parity image allows infinite 
magnification but not infinite demagnification. The shapes of the $\mathcal{M}_{\max}$ and $\mathcal{M}_{\min}$ curves depend on the macromodel through 
$\kappa$ and $\gamma$, as shown in Fig. 12. 



3.2 Limits from a single flux measurement 

A key result from Figs. 11 and 12 is that we can use any measured $\mathcal{M} \neq 1$ to place a lower bound on $b/a$, without needing 
to know the relative positions of the perturber and source. This bound comes from the fact that, although the region below 
the $\mathcal{M}_{\max}$ curve is completely accessible by choosing appropriate values for $u_0$ and $v_0$, the region above the $\mathcal{M}_{\max}$ curve is 
excluded by definition. For any observed $\mathcal{M}_{\rm obs} > 1$, we simply find the value of $b/a$ where $\mathcal{M}_{\max}(b/a) = \mathcal{M}_{\rm obs}$, and that gives 
us the lower limit on the size of the perturber (relative to the size of the source). The bound can be understood physically 
with the idea that a source cannot 'feel' a perturber that is much smaller than itself. Conversely, there is no upper bound 
because a source that is small relative to the perturber can be placed as far from or as close to the caustics as necessary to 
reproduce any observed magnification. (Similar reasoning applies to both the magnification and demagnification regimes in 
the negative parity case.) 

For the positive parity case, increasing $\kappa$ and $\gamma$ (increasing $\mu_0$) lowers the $\mathcal{M}_{\max}$ curve, or equivalently, increases the 
minimum value of $b/a$ required to produce a given normalized magnification (see Fig. 12). For the negative parity case, 
decreasing $\kappa$ and $\gamma$ (increasing $|\mu_0|$) lowers both the $\mathcal{M}_{\max}$ and $\mathcal{M}_{\min}$ curves. This is equivalent to increasing the minimum 
value of $b/a$ required to produce a given $\mathcal{M} > 1$, or decreasing the minimum value of $b/a$ required to produce a given $\mathcal{M} < 1$. 
Although the lower bound on $b/a$ does depend on $\kappa$ and $\gamma$, these parameters are well constrained by the macromodel (see 
Fig. 14 below). 
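
The link between $(\kappa, \gamma)$ and the macromodel magnification $\mu_0$ can be checked with a few lines of code. The sketch below assumes the standard lensing relation $\mu_0 = 1/[(1-\kappa)^2 - \gamma^2]$, which is not restated in this section; the function name `macro_magnification` is a hypothetical choice for the example.

```python
def macro_magnification(kappa, gamma):
    """Macromodel magnification mu_0 = 1 / ((1 - kappa)^2 - gamma^2).

    Assumes the standard convergence + shear relation; a negative value
    corresponds to a negative parity (saddle point) image.
    """
    det = (1.0 - kappa) ** 2 - gamma ** 2
    if det == 0.0:
        raise ZeroDivisionError("image lies on a critical curve")
    return 1.0 / det

# The two fiducial cases used in the figures, plus image A of B1422+231.
for kappa, gamma in [(0.3, 0.3), (0.7, 0.7), (0.381, 0.496)]:
    mu0 = macro_magnification(kappa, gamma)
    parity = "positive" if mu0 > 0 else "negative"
    print(f"kappa = {kappa}, gamma = {gamma}: mu_0 = {mu0:+.2f} ({parity} parity)")
```

For $\kappa = \gamma$ the expression reduces to $\mu_0 = 1/(1 - 2\kappa)$, so raising $\kappa = \gamma$ from 0.3 increases $\mu_0$, while lowering it from 0.7 increases $|\mu_0|$, in line with the trends described above.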

If the observed image flux $f_{\rm obs}$ were known precisely, then $\mathcal{M}_{\rm obs} = f_{\rm obs}/f_0$ could be used to place a strict lower bound 
on $b/a$. Of course, flux measurement uncertainties smear the bound, and the simplest way to incorporate the uncertainties is 
to define a goodness of fit, 

$$\chi^2_{\rm sub}\left(\frac{b}{a}\right) = \left[ \frac{\mathcal{M}_{\rm mod}(b/a; u_0, v_0) - \mathcal{M}_{\rm obs}}{\sigma_{\rm obs}} \right]^2 . \qquad (23)$$

Fig. 13 shows a sample $\chi^2$ analysis, where we generated a mock measurement of $\mathcal{M}_{\rm obs} = 3.63$ assuming $\kappa = \gamma = 0.3$, $a = b = 5$, 
$u_0 = v_0 = 0$, and $\sigma_{\rm obs} = 0.1 \times \mathcal{M}_{\rm obs}$. In the figure, the solid line shows $\chi^2$ versus $b/a$ if we fix the source at the origin, while 
the crosses show the result if we optimize over the source position. (We always fix $\kappa$ and $\gamma$, because they are determined well 

1 Again, by magnification or demagnification we mean images brighter or fainter than produced by the convergence and shear field alone. 

Figure 13. $\chi^2$ versus $b/a$ for a mock measurement of the normalized magnification. The solid line shows the result if we fix the source 
at its input value, while the crosses show the result if we optimize over the source position. 

enough by the macromodel; see §4.2 below.) If the source position were known, we would get both upper and lower limits on 
$b/a$. If the source position is unknown, we lose the upper limit but still get the lower limit as discussed above. This is the 
more interesting limit anyway, since for a given flux ratio anomaly it is useful to know the smallest possible perturbing mass 
that could produce the anomaly. 



4 APPLICATION TO OBSERVED LENS SYSTEMS 

Before we can apply our millilensing theory to derive substructure mass bounds, we must first figure out which of the images 
are perturbed. We must also determine the convergence and shear that create the background in which the clump lives. To 
do this, the idea is to fit a smooth macromodel to an observed lens, identify any images that cannot be fit, and attribute 
the discrepancy to substructure. We emphasize that this process is independent of any assumptions about the nature of the 
substructure. It does depend on our choice of macromodel, but the models we use are standard in millilensing analyses (e.g., 
Dalal & Kochanek 2002; Kochanek & Dalal 2004; Metcalf et al. 2004; Metcalf 2004). Dependence on the substructure model 
enters only when we bring in the method from §3 to derive constraints on the masses of the substructures. 



4.1 Methodology 

For the macromodel, we consider two related models. In the first case, we treat the lens galaxy as a singular isothermal 
ellipsoid (SIE), which is a simple but useful model that is consistent with many lensing, dynamical, and X-ray observations 
(e.g., Fabbiano 1989; Rix et al. 1997; Gerhard et al. 2001; Treu & Koopmans 2002; Koopmans et al. 2003; Rusin et al. 2003). 
The model has surface mass density 

$$\kappa(r,\theta) = \frac{R_{\rm ein}}{2 r \sqrt{1 - \epsilon \cos 2(\theta - \theta_\epsilon)}} , \qquad (24)$$

where $R_{\rm ein}$ is the macromodel Einstein radius, $\epsilon$ is an ellipticity parameter related to the axis ratio $q$ of the ellipse by 
$q^2 = (1 - \epsilon)/(1 + \epsilon)$, and $\theta_\epsilon$ is the orientation angle of the ellipse major axis. A simple SIE model is insufficient to fit most 
4-image lenses, so we add an external shear term to represent the effects of other mass in the environment of the lens galaxy 
(Keeton, Kochanek & Seljak 1997). The $N_p = 10$ model parameters are then: the position, Einstein radius, ellipticity, and 
orientation of the lens galaxy; the amplitude and direction of the external shear; and the position and flux of the source. 
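
A short sketch of the SIE convergence may help make the roles of $\epsilon$ and $\theta_\epsilon$ concrete. It assumes the form $\kappa(r,\theta) = R_{\rm ein} / [2 r \sqrt{1 - \epsilon \cos 2(\theta - \theta_\epsilon)}]$ together with $q^2 = (1-\epsilon)/(1+\epsilon)$; the function names and the example parameter values are hypothetical.

```python
import numpy as np

def axis_ratio(eps):
    """Axis ratio q of the ellipse, from q^2 = (1 - eps) / (1 + eps)."""
    return np.sqrt((1.0 - eps) / (1.0 + eps))

def sie_convergence(r, theta, r_ein, eps, theta_eps):
    """Convergence of a singular isothermal ellipsoid in polar coordinates.

    Assumed form: kappa = r_ein / (2 r sqrt(1 - eps cos 2(theta - theta_eps))).
    """
    return r_ein / (2.0 * r * np.sqrt(1.0 - eps * np.cos(2.0 * (theta - theta_eps))))

# Illustrative values only.
r_ein, eps, theta_eps = 1.0, 0.3, 0.0
k_major = sie_convergence(1.0, theta_eps, r_ein, eps, theta_eps)
k_minor = sie_convergence(1.0, theta_eps + np.pi / 2, r_ein, eps, theta_eps)
print(axis_ratio(eps), k_major / k_minor)
```

At fixed radius the convergence is higher along $\theta_\epsilon$ than perpendicular to it by a factor $1/q$, which is what one expects if the iso-density contours are ellipses elongated along $\theta_\epsilon$; setting $\epsilon = 0$ recovers the singular isothermal sphere, $\kappa = R_{\rm ein}/2r$.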



In the alternate macromodel, we keep the same mono 
lens potential in multipoles, we write (see Kochanek 2004) 

$$\phi(r,\theta) = R_{\rm ein}\, r + \frac{R_{\rm ein}^4}{2 r^2}\, \gamma_{\rm int} \cos 2(\theta - \theta_{\rm int}) + \frac{r^2}{2}\, \gamma_{\rm ext} \cos 2(\theta - \theta_{\rm ext}) + \ldots \qquad (25)$$

The first term is the potential for a singular isothermal sphere (a mass model given by eq. 24 with $\epsilon = 0$). The second 
term represents the shear due to mass within the Einstein radius, where $\gamma_{\rm int}$ and $\theta_{\rm int}$ are the internal shear amplitude and 
direction. The third term represents the shear due to mass outside the Einstein radius, which can now include a contribution 
from the outer parts of the lens galaxy halo in addition to a contribution from the larger lens environment. Compared to 
the ellipsoid+shear model, the internal+external shear model allows a more general structure for the lens galaxy, but at the 
expense of not representing true elliptical symmetry very well (since the multipole series is truncated). Both macromodels 
have the same number of parameters. 

We define the macromodel goodness of fit to include contributions from the image positions, the image fluxes, and the 
galaxy position (if known) : 

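$$\chi^2_{\rm tot} = \chi^2_{\rm pos} + \chi^2_{\rm flux} + \chi^2_{\rm gal} , \qquad (26)$$

$$\chi^2_{\rm pos} = \sum_{i = A,B,C,D} \frac{|x_{{\rm mod},i} - x_{{\rm obs},i}|^2}{\Delta x_{{\rm obs},i}^2} , \qquad (27)$$

$$\chi^2_{\rm flux} = \sum_{i = A,B,C,D} \frac{(f_{{\rm mod},i} - f_{{\rm obs},i})^2}{\Delta f_{{\rm obs},i}^2} , \qquad (28)$$

$$\chi^2_{\rm gal} = \frac{|X_{\rm mod} - X_{\rm obs}|^2}{\Delta X_{\rm obs}^2} , \qquad (29)$$

where $x$ values are the positions of the images, $f$ values are the fluxes of the images, $X$ is the position of the galaxy, and $\Delta$ 
indicates measured uncertainties in the respective quantities. 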
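
The three contributions to the macromodel goodness of fit can be written compactly in code. The sketch below is one possible arrangement, assuming isotropic position uncertainties per image; the function name `chi2_terms`, the `fit_flux` mask used to relax individual flux constraints, and all numerical values are hypothetical and do not correspond to any of the lenses studied here.

```python
import numpy as np

def chi2_terms(x_mod, x_obs, dx, f_mod, f_obs, df,
               gal_mod=None, gal_obs=None, dgal=None, fit_flux=None):
    """Position, flux, and galaxy chi^2 terms and their sum."""
    x_mod, x_obs, dx = (np.asarray(v, dtype=float) for v in (x_mod, x_obs, dx))
    f_mod, f_obs, df = (np.asarray(v, dtype=float) for v in (f_mod, f_obs, df))

    # Image positions: squared 2-d offset over the squared uncertainty.
    chi2_pos = np.sum(np.sum((x_mod - x_obs) ** 2, axis=1) / dx ** 2)

    # Fluxes: only images flagged in fit_flux contribute (all by default).
    if fit_flux is None:
        fit_flux = np.ones(f_obs.shape, dtype=bool)
    fit_flux = np.asarray(fit_flux, dtype=bool)
    chi2_flux = np.sum(((f_mod - f_obs)[fit_flux] / df[fit_flux]) ** 2)

    # Galaxy position: included only if it has been measured.
    chi2_gal = 0.0
    if gal_obs is not None:
        offset = np.asarray(gal_mod, dtype=float) - np.asarray(gal_obs, dtype=float)
        chi2_gal = np.sum(offset ** 2) / dgal ** 2

    return {"pos": chi2_pos, "flux": chi2_flux, "gal": chi2_gal,
            "tot": chi2_pos + chi2_flux + chi2_gal}

# Made-up numbers: image A is displaced by one sigma, image B is 10 per cent too bright.
x_obs = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
x_mod = [[0.003, 0.004], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
dx = [0.005, 0.005, 0.005, 0.005]
f_obs, f_mod, df = [10.0, 8.0, 5.0, 1.0], [10.0, 8.8, 5.0, 1.0], [1.0, 0.8, 0.5, 0.1]

all_fluxes = chi2_terms(x_mod, x_obs, dx, f_mod, f_obs, df)
relax_b = chi2_terms(x_mod, x_obs, dx, f_mod, f_obs, df,
                     fit_flux=[True, False, True, True])
print(all_fluxes["tot"], relax_b["tot"])
```

Dropping an image from `fit_flux` removes its term from the flux sum while leaving the position constraints untouched.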

Evidence for substructure is revealed when the macromodel fails to fit the observed fluxes,$^2$ but to understand the 
substructure we need to identify which of the images are perturbed. To do that, we systematically relax the flux constraints 
and refit the macromodel. (We always fit the positions of all four images.) For example, if fitting all four fluxes fails, then 
we try to fit the fluxes of images A/B/C, then images A/B/D, then A/C/D, and finally B/C/D. If one of those cases, say 
A/C/D, does provide an acceptable fit, then we settle on the hypothesis that image B is the one most likely to be perturbed 
by substructure. Should relaxing the flux constraints on one image at a time fail to produce an acceptable fit, we consider 
the six different possibilities for relaxing two of the flux constraints. (There is no point in relaxing three flux constraints, 
because the flux of one image can always be fit trivially.) When we fit all four fluxes, the number of constraints on the model 
is $N_c = 14$ if the galaxy position is known, or $N_c = 12$ if not. With $N_p = 10$ free parameters, we can relax one or two flux 
constraints and still have a model that is overconstrained or at worst has $\nu = 0$ degrees of freedom. 
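
The sequence of fits and the associated degrees of freedom can be enumerated mechanically. The sketch below assumes that each image position contributes two constraints (so $N_c = 8 + 4 + 2 = 14$ when all fluxes and the galaxy position are used), which reproduces the counts quoted above; the helper names are hypothetical.

```python
from itertools import combinations

IMAGES = "ABCD"
N_PARAMS = 10  # N_p for either macromodel

def n_constraints(n_fluxes, galaxy_known):
    """Four image positions (2 each), the fitted fluxes, and optionally the galaxy position."""
    return 2 * len(IMAGES) + n_fluxes + (2 if galaxy_known else 0)

def flux_cases(min_fluxes=2):
    """Flux subsets in the order they are tried: all four, then three, then two."""
    cases = []
    for n in range(len(IMAGES), min_fluxes - 1, -1):
        cases.extend("".join(combo) for combo in combinations(IMAGES, n))
    return cases

for case in flux_cases():
    nu_known = n_constraints(len(case), True) - N_PARAMS
    nu_unknown = n_constraints(len(case), False) - N_PARAMS
    print(f"{case:5s} nu = {nu_known} (galaxy known), {nu_unknown} (galaxy unknown)")
```

The worst case, two fitted fluxes with no measured galaxy position, gives $\nu = 0$, so every model in the sequence remains at least exactly constrained.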

Once we have found a macromodel that reproduces all of the observed positions and some of the observed fluxes, 
we interpret the remaining (discrepant) fluxes as evidence for substructure. We characterize the flux perturbation by the 
ratio $\mathcal{M}_{\rm obs} = f_{\rm obs}/f_{\rm mod}$ of the observed flux to that predicted by the macromodel. We can then plug this value into our 
substructure analysis to find the smallest size of an SIS clump that could produce that perturbation (as discussed in §3). 
The substructure analysis depends on the macromodel through the local convergence and shear, but we show below that 
the statistical uncertainties are small and unimportant for our analysis. In other words, formally we take the substructure 
goodness of fit from eq. (23), hold $\kappa$ and $\gamma$ fixed from the macromodel, and then optimize over $u_0$ and $v_0$ to trace out $\chi^2_{\rm sub}$ 
as a function of $b/a$. We then use this function to obtain a $1\sigma$ lower limit on $b/a$. We conservatively assume 10 per cent flux 
uncertainties in the substructure analysis, dominated not by measurement uncertainties (which can reach the per cent level; 
e.g., Fassnacht et al. 2002) but by systematic effects such as time delays. Modifying that assumption would produce a fairly 
simple change in our mass bounds (see the Appendix), but would not affect our conclusions. 

To convert the limit on $b/a$ into a minimum Einstein radius $b_{\min}$ and then to a minimum mass within that Einstein radius 
$M(b)_{\min}$, we must specify a source size $a$. It has been argued that a lower bound on the size of the emitting region of a quasar 
in the radio is $a > 1$ pc (Wyithe et al. 2002), and that a reasonable size is $a \sim 10$ pc (see Metcalf et al. 2004). The source size 
does contribute uncertainty to our analysis, but we shall see that it does not really affect our conclusions. 






4.2 B1422+231 

The 4-image radio and optical lens B1422+231 was the first system identified as likely to contain substructure based on 
its anomalous flux ratios (Mao & Schneider 1998). The fluxes of images A, B, and C violate the relation $f_A - f_B + f_C \approx 0$ 
generically expected for a lens in a 'cusp configuration' corresponding to a source lying near a cusp caustic (Schneider & Weiss 


2 Failure to fit the image positions could also provide evidence for substructure (see §2.5), but we expect that in most cases 
existing data are not good enough to detect this effect. In B1422+231, the position uncertainties from VLBA maps are very precise 
(Patnaik & Narasimha 2001), but as our formalism is not currently equipped to use astrometric perturbations to constrain substructure, 
we inflate the errorbars. The ability of astrometry to probe substructure certainly deserves further study. 

                   Normalized Magnification
case    Image A    Image B    Image C    Image D    $\chi^2_{\rm tot}$    $\chi^2_{\rm flux}$
abcd    ...        ...        ...        ...        ...        ...
abc     ...        ...        ...        ...        ...        ...
abd     ...        ...        ...        ...        ...        ...
acd     ...        ...        ...        ...        ...        49.17
bcd     1.21       1.01       ...        ...        3.07       0.89
ab      1.00       ...        0.816      0.649      ...        0.90
ac      1.02       ...        ...        0.704      97.99      35.69
ad      1.00       0.829      0.820      0.793      10.29      6.83
bc      1.21       1.01       0.993      0.909      2.40       0.56
bd      1.19       1.00       0.977      0.909      2.41       1.16
cd      1.23       1.04       1.00       0.909      1.71       0.94

Table 1. Modeling results for B1422+231 using an ellipsoid+shear macromodel. Column 1 lists the image fluxes that were used to 
constrain the macromodel. Columns 2-5 list $\mathcal{M}_{\rm obs}$, or the ratio of the observed magnification to that predicted by the macromodel, for 
the four images. Columns 6-7 give the total $\chi^2$ and the contribution from the fluxes. 



1992; Mao 1992; but see Keeton et al. 2003). To model the lens, we use the image positions and fluxes from the 8.4 GHz 
observations by Patnaik et al. (1999). Their VLBA maps yield very precise relative positions, but we conservatively inflate 
the uncertainties to 5 mas because we do not study astrometric perturbations in detail in this paper (see §2.5). The radio 
fluxes are essentially constant (see Patnaik & Narasimha 2001), so we can neglect systematics and use the flux measurement 
uncertainties quoted by Patnaik et al. (1999). We use radio rather than optical fluxes because they should be sensitive only 
to dark matter substructure (not to microlensing by stars). For the lens galaxy, we use the position given by CASTLES.$^3$ 

Table 1 shows our results for fitting the system with an ellipsoid+shear macromodel. Fitting the fluxes of all four images 
gives a very poor fit ($\chi^2/\nu = 113/4$), so we relax the flux constraints one at a time to consider the possibility that one of the 
images might be perturbed by substructure. Fitting the fluxes of A, C, and D yields an equally bad fit, so we can rule out 
the hypothesis that only image B is perturbed by substructure. The same result holds if we fit ABC or ABD. However, if we 
consider A to be perturbed (fitting only B, C, and D), then we get a good fit with $\chi^2/\nu = 3.1/3$. 

We then apply our substructure analysis to this model to find the minimum clump mass required to perturb image 
A. The macromodel has convergence $\kappa = 0.381$ and shear $\gamma = 0.496$ at the position of image A. In order to produce a 
perturbation of $\mathcal{M}_{\rm obs} = 1.213$, an SIS clump in this convergence and shear field must have $b/a > 0.0561$ ($1\sigma$). Given the 
source redshift $z_s = 3.62$ and lens redshift $z_l = 0.34$, this $b/a$ bound translates to a mass within the Einstein radius of 
$M(b) > 2.07 \times 10^3\,(a/10\ {\rm pc})^2\,M_\odot$, or equivalently to a velocity dispersion of $\sigma > 2.24\,(a/10\ {\rm pc})^{1/2}$ km s$^{-1}$. We emphasize that we 
are quoting the mass within the Einstein radius, and the total clump mass may be much larger. The large lower limit on the 
perturber mass confirms and quantifies the conventional wisdom that microlensing cannot explain radio flux ratio anomalies. 

The substructure analysis does depend on the macromodel, through the local convergence and shear at the position of 
image A. There is a degeneracy in the macromodel between the ellipticity and external shear which leads to an uncertainty 
in $\kappa$ and $\gamma$, as shown in Fig. 14. The effect is small, however: over the $1\sigma$ confidence region of the macromodel, $\kappa$ varies by 
about 0.01 and $\gamma$ varies by about 0.025. This small variation affects the mass bound by only ~8 per cent, and the velocity 
dispersion bound by even less. Another uncertainty in the macromodel arises from the mass sheet degeneracy, but we saw in 
§2.4 that this has no effect on our substructure analysis. In other words, the macromodel is constrained well enough for our 
purposes. 

If we allow for the possibility of clumps in front of more than one image, we find that there are three models that give 
a good fit to the data (BC, BD, and CD; see Table 1). All three models still require substructure in front of image A, with 
mass bounds similar to that found for the BCD model. For each model, the $1\sigma$ lower limit on the mass in front of the other 
image is zero. That is, we can generically conclude that there must be a clump of mass $M(b) > 2 \times 10^3\,M_\odot$ in front of image 
A, and there is no evidence of clumps in front of any other images. 



3 CfA/Arizona Space Telescope Lens Survey; see http://cfa-www.harvard.edu/castles 

Figure 14. The ellipses show $\chi^2$ contours for B1422+231 in the ellipticity-shear plane, showing the $1\sigma$, 90 per cent, $2\sigma$, 99 per cent, 
and $3\sigma$ confidence regions. The dotted contours are (a) $\kappa$ and (b) $\gamma$ contours plotted in intervals of $\delta\kappa = \delta\gamma = 0.005$. The small variation 
in $\kappa$ and $\gamma$ over the ellipses implies that uncertainties in the substructure analysis due to the macromodel are small. 

                   Normalized Magnification
case    Image A    Image B    Image C    Image D    $\chi^2_{\rm tot}$    $\chi^2_{\rm flux}$
abcd    1.80       0.726      3.14       17.32      172.17     171.24
abc     1.84       0.744      3.22       17.59      81.80      80.48
abd     1.99       0.805      3.48       18.75      124.29     122.21
acd     0.796      0.321      1.39       7.46       93.05      90.94
bcd     2.05       0.828      3.58       19.33      150.05     148.05
ab      2.06       0.829      3.59       20.00      31.55      30.77
ac      0.844      0.339      1.47       8.21       14.43      13.66
ad      0.912      0.367      1.59       8.54       82.57      80.45
bc      2.12       0.856      3.71       20.67      56.97      56.19
bd      2.38       0.960      4.15       22.41      95.25      93.28
cd      0.497      0.200      0.869      4.63       67.29      65.02

Table 2. Modeling results for B2045+265 using an ellipsoid+shear macromodel. 



4.3 B2045+265 

B2045+265 is a 4-image radio and optical lens with the source quasar at redshift $z_s = 1.28$ and the lens galaxy at redshift 
$z_l = 0.87$ (Fassnacht et al. 1999). It is the tightest known cusp configuration lens, and it exhibits a strong violation of the 
cusp relation in both radio and optical bands (Keeton et al. 2003). We seek to fit the 5 GHz MERLIN radio data from 
Fassnacht et al. (1999), taking radio component E to indicate the position of the lens galaxy. 

Table 2 shows the results of modeling this system with an ellipsoid+shear macromodel. Attempting to fit all of the 
positions and fluxes gives a very bad fit ($\chi^2/\nu = 172/4$). Relaxing some of the flux constraints yields somewhat better fits, 
with the best case being when we only fit the fluxes of images A and C ($\chi^2/\nu = 14.4/2$). However, all of these models 
underpredict the flux of image D, by as much as a factor of 20, implying that there must be a clump producing a very large 
perturbing magnification. The problem is that when a clump is placed in front of a negative parity image like D, the cross 
section for significant magnification is very small. Therefore, the large magnifications shown in these models would require 
a very massive clump in a very particular position. Not only would such a large perturbing mass almost certainly result in 
resolvable splittings of image D that are not observed, it would probably affect the positions and fluxes of the other images 
as well. 

These problems lead us to consider the alternate internal+external shear macromodel, whose results are given in Table 
3. Trying to fit all of the fluxes still gives a poor fit ($\chi^2/\nu = 90/4$). Dropping one of the flux constraints improves the fit, but 

B2045+265 internal+external shear

                   Normalized Magnification
case    Image A    Image B    Image C    Image D    $\chi^2_{\rm tot}$    $\chi^2_{\rm flux}$
abcd    1.88       0.743      3.31       1.00       89.57      82.72
abc     1.72       0.741      2.81       354.38     71.89      71.40
abd     2.09       0.827      3.68       1.00       38.66      31.74
acd     0.841      0.333      1.48       1.00       20.25      14.09
bcd     2.17       0.856      3.81       1.00       64.26      57.31
ab      1.93       0.829      3.15       381.13     28.03      27.53
ac      0.852      0.365      1.39       166.94     11.42      10.92
ad      0.995      0.393      1.75       1.00       6.31       0.00
bc      1.97       0.846      3.21       396.07     51.22      50.72
bd      2.52       0.996      4.44       1.00       7.05       0.00
cd      0.565      0.224      0.994      1.00       5.74       0.01

Table 3. Modeling results for B2045+265 using an internal+external shear macromodel. 

even the best case is still not an acceptable fit (the ACD model has $\chi^2/\nu = 20/3$). Dropping two flux constraints, however, 
yields three acceptable models. Fitting A and D gives $\chi^2/\nu = 6.31/2$, fitting B and D gives $\chi^2/\nu = 7.05/2$, and fitting C 
and D gives $\chi^2/\nu = 5.74/2$; all three models differ from the data by only $\sim 2\sigma$. Although the CD case has the lowest $\chi^2$, it 
requires the positive parity A image to be demagnified by about a factor of two, which is not possible with an SIS clump. 
So we are only left with the AD and BD cases as viable models for the system. The AD model has substructure perturbing 
masses in front of images B ($\mathcal{M}_{\rm obs} = 0.3938$) and C ($\mathcal{M}_{\rm obs} = 1.7518$) while the BD model has perturbers in front of images 
A ($\mathcal{M}_{\rm obs} = 2.5283$) and C ($\mathcal{M}_{\rm obs} = 4.4474$). 

The results of applying our substructure analysis are shown in Table 4. In the BD model, the substantial magnifications 
of images A and C require large clumps: the mass in front of A must be larger than $2.29 \times 10^6\,M_\odot$ ($\sigma > 15.51$ km s$^{-1}$), 
and the mass in front of image C must be larger than $1.58 \times 10^7\,M_\odot$ ($\sigma > 25.12$ km s$^{-1}$). In the AD model, the minimum 
masses necessary to reproduce the anomalous fluxes are somewhat smaller: $M(b) > 4.77 \times 10^5\,M_\odot$ ($\sigma > 10.48$ km s$^{-1}$) for 
image B, and $M(b) > 3.71 \times 10^5\,M_\odot$ ($\sigma > 9.84$ km s$^{-1}$) for image C. (We have again assumed a source size $a = 10$ pc.) 
Although we cannot identify a unique model, the important points are that the clump masses are $\gg M_\odot$ and therefore 
exclude microlensing as an explanation for the observed flux ratio anomalies, and also that they agree well with the sizes of 
clumps predicted by a ΛCDM cosmology (e.g., Klypin et al. 1999; Moore et al. 1999). 

The mass bounds given above were calculated for a clump lying within the halo of the main lens galaxy. It is possible, 
though, that the clump lies elsewhere along the line of sight (Keeton 2003; Metcalf 2004); the likelihood depends on the 
relative abundances of embedded and isolated clumps (see Chen et al. 2003). As discussed in §2, our formalism can easily 
accommodate the hypothesis that the clump lies at redshift $z_c \neq z_l$ (see eq. 3). Fig. 15 shows how the clump bounds for 
images B and C vary with clump redshift for the AD internal+external shear model. Moving the clump in redshift away from 
the lens galaxy increases the lower bound on $b/a$ for negative parity image B, and decreases it for positive parity image C. 
(These dependences can be understood in terms of eq. 3 and Fig. 12.) However, the effect is only tens of percent over a wide 
range in redshift. A stronger variation is seen in the mass bound, because of the additional redshift dependence in the lensing 
critical density ($\Sigma_{\rm cr} \propto D_{os} / D_{ol} D_{ls}$). Even so, the change is a factor of a few, so uncertainty in the location of the clump 
along the line of sight does not significantly affect order of magnitude conclusions. 

4.4 B1555+375 

B1555+375 is a faint 4-image radio lens discovered by Marlow et al. (1999), whose fluxes violate the relation $f_A - f_B \approx 0$
expected for a lens in a 'fold configuration' (Gaudi et al. 2005). We fit the 5 GHz data from Marlow et al. The position of the
lens galaxy with respect to the images has not been measured. The lens and source redshifts are not known, but Marlow et
al. estimate them to lie in the ranges $1.0 < z_s < 3.0$ and $0.5 < z_l < 1.0$.

Our attempts to fit this system with an ellipsoid+shear macromodel result in models with very large and perpendicularly
aligned ellipticities and shears ($e \sim 0.9$, $\gamma_{ext} \sim 0.3$, and $\Delta\theta = 90^\circ$). These models are highly contrived, and have extremely large
and implausible magnifications. We consider them to be unphysical, and turn instead to internal+external shear macromodels.

Fitting all four images yields a model with $\chi^2/\nu \approx 45.6/2$ that reproduces the image positions well but not the flux
ratios. Relaxing the flux constraint on one image can improve the fit, but the resulting models are unacceptable in
that they require the positive parity image A to be demagnified or the negative parity image D to be highly amplified by a
Lens        Macromodel               Perturbed   $\chi^2/\nu$   Image   $M(b)_{min}$           $\sigma_{min}$
                                     Image(s)                           ($M_\odot$)            (km s$^{-1}$)

B1422+231   ellipsoid+shear          A           3.073/3        A       $2.07 \times 10^3$     2.24

B2045+265   internal+external shear  A and C     7.051/2        A       $2.29 \times 10^6$     15.51
                                                                C       $1.58 \times 10^7$     25.12
                                     B and C     6.320/2        B       $4.77 \times 10^5$     10.48
                                                                C       $3.71 \times 10^5$     9.84

B1555+375   internal+external shear  A and C     1.045/0        A       $1.96 \times 10^5$     7.14
                                                                C       $3.54 \times 10^6$     14.72
                                     B and C     0.000/0        B       $1.19 \times 10^5$     6.30
                                                                C       $5.08 \times 10^4$     5.09


Table 4. The $1\sigma$ lower limits on the mass within the Einstein radius and the velocity dispersion of perturbing clumps. There is only
one acceptable model for B1422+231, but there are two possibilities for B2045+265 and B1555+375. The bounds scale with the assumed
source size as $M(b)_{min} \propto (a/10\,\mathrm{pc})^2$ and $\sigma_{min} \propto (a/10\,\mathrm{pc})^{1/2}$.
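
As a simple consistency check on Table 4, recall that for an isothermal sphere the Einstein radius scales as $b \propto \sigma^2$ and the mass within it as $M(b) \propto b^2$, so at fixed lens and source redshifts $M(b) \propto \sigma^4$. The two clumps in any one lens model should therefore obey this relation. The sketch below (variable and function names are ours) applies it to the tabulated pairs.

```python
# (M_min [M_sun], sigma_min [km/s]) for the two clumps in each two-clump model of Table 4
TABLE4_PAIRS = {
    "B2045+265 (A and C)": ((2.29e6, 15.51), (1.58e7, 25.12)),
    "B2045+265 (B and C)": ((3.71e5, 9.84), (4.77e5, 10.48)),
    "B1555+375 (A and C)": ((1.96e5, 7.14), (3.54e6, 14.72)),
    "B1555+375 (B and C)": ((5.08e4, 5.09), (1.19e5, 6.30)),
}


def predicted_sigma(m_ref, sigma_ref, m_other):
    """Scale sigma_ref to another clump mass using M proportional to sigma**4."""
    return sigma_ref * (m_other / m_ref) ** 0.25


if __name__ == "__main__":
    for label, ((m1, s1), (m2, s2)) in TABLE4_PAIRS.items():
        s_pred = predicted_sigma(m1, s1, m2)
        print(f"{label}: predicted {s_pred:.2f} km/s, tabulated {s2:.2f} km/s")
```

The comparison is meaningful only within a single lens, because the proportionality constant depends on the lens and source redshifts.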



clump. The only acceptable models are found when we relax two flux constraints, and in fact there are two good cases (see
Table 2). One possibility is to have a clump in front of image B with $b/a > 0.328$, plus a clump in front of image C with
$b/a > 0.214$. This model fits the data perfectly, which is not surprising because it has $\nu = 0$ degrees of freedom. Assuming
redshifts of $z_s = 2.0$ and $z_l = 0.75$, the $b/a$ bounds translate into clump mass limits of $M > 1.19 \times 10^5\,M_\odot$ ($\sigma > 6.30$ km
s$^{-1}$) and $M > 5.08 \times 10^4\,M_\odot$ ($\sigma > 5.09$ km s$^{-1}$) for images B and C, respectively. Varying the redshifts can change the mass
limits by a factor of a few up to $\sim 10$, but does not affect the conclusion that the fluxes cannot be explained by microlensing.
The other possibility is to have a clump in front of image A with $b/a > 0.420$ ($M > 1.96 \times 10^5\,M_\odot$, or $\sigma > 7.14$ km s$^{-1}$), plus
a clump in front of image C with $b/a > 1.783$ ($M > 3.54 \times 10^6\,M_\odot$, or $\sigma > 14.72$ km s$^{-1}$). This model gives $\chi^2 = 1.045$ for
$\nu = 0$, which is formally unacceptable. However, as an exercise we added random noise to the data and refit. A substantial
fraction of these cases yielded $\chi^2 = 0$, which suggests that the model is in fact consistent with the data given the measurement
uncertainties.
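
The conversion from a $b/a$ bound to mass and velocity dispersion bounds can be sketched without redoing the cosmology, by rescaling one quoted bound. The sketch below takes the image B numbers above as a reference (valid for the assumed $z_s = 2.0$, $z_l = 0.75$ and $a = 10$ pc) and applies the isothermal-sphere scalings $M(b) \propto b^2$ and $\sigma \propto b^{1/2}$ with $b = (b/a)\,a$. The function name `clump_bounds` is hypothetical.

```python
import math

# Reference bound quoted in the text for image B of B1555+375
# (assumes z_s = 2.0, z_l = 0.75 and a source size a = 10 pc).
REF_B_OVER_A = 0.328
REF_MASS = 1.19e5       # M_sun
REF_SIGMA = 6.30        # km/s
REF_SOURCE_PC = 10.0    # pc


def clump_bounds(b_over_a, source_size_pc=REF_SOURCE_PC):
    """Rescale the reference bound to another b/a limit and source size.

    Uses the isothermal-sphere scalings M(b) ~ b**2 and sigma ~ b**0.5
    at fixed lens and source redshifts. Returns (M_min, sigma_min).
    """
    b_ratio = (b_over_a / REF_B_OVER_A) * (source_size_pc / REF_SOURCE_PC)
    return REF_MASS * b_ratio ** 2, REF_SIGMA * math.sqrt(b_ratio)


if __name__ == "__main__":
    for image, limit in (("C", 0.214), ("A", 0.420), ("C", 1.783)):
        mass, sigma = clump_bounds(limit)
        print(f"image {image}: b/a > {limit:.3f} -> M > {mass:.3g} M_sun, sigma > {sigma:.2f} km/s")
```

Rescaling in this way recovers the other three bounds quoted for this lens to within about one per cent, which is the level of rounding in the quoted numbers.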



4.5 B0712+472 

B0712+472 is a 4-image lens with an image configuration intermediate between a cusp and fold (Jackson et al. 1998). The
optical flux ratios strongly violate the cusp and fold relations, but at radio wavelengths the violation is marginal (Keeton et al.
B0712+472

         internal+external shear                          ellipsoid+shear
case     $\chi^2_{tot}$   $\chi^2_{flux}$   F-test        $\chi^2_{tot}$   $\chi^2_{flux}$   F-test

ABCD     7.29             1.49              = 1.00        18.83            9.37              = 1.00
ABC      7.06             1.41              0.85          14.59            7.94              0.20
ABD      4.51             0.98              0.08          10.34            7.00              0.06
ACD      6.59             2.96              0.50          9.42             2.26              0.04
BCD      6.48             0.63              0.44          6.06             0.03              0.02
AB       2.88             1.20              0.06          2.22             1.63              0.01
AC       6.48             2.95              0.49          3.46             0.01              0.02
AD       2.42             0.73              0.05          2.02             0.02              0.01
BC       6.44             0.65              0.47          4.10             0.00              0.03
BD       4.48             0.99              0.14          2.88             0.02              0.02
CD       0.72             0.11              0.01          0.98             0.02              0.01


Table 5. Modeling results for B0712+472 using both macromodels. The F-test gives the probability that $\chi^2$ has decreased (relative to
the ABCD model) only because the model has fewer constraints, rather than because the fit is significantly better.



2003; Gaudi et al. 2005). The difference suggests that the optical flux ratios are affected by microlensing. We focus on the
radio flux ratios, and fit the data given by Jackson et al. (2000). As shown in Table 5, attempting to fit all four radio fluxes
yields $\chi^2/\nu = 18.77/4$ for an ellipsoid+shear macromodel, or $\chi^2/\nu = 7.29/4$ for an internal+external shear macromodel. The
internal+external shear model differs from the data at only 88 per cent confidence, so it is a reasonably good fit. Nevertheless,
it is interesting to consider whether the fit can be improved by relaxing the flux constraints, as shown in Table 5. Among
three-flux models, only ABD yields a noticeably better fit. Among two-flux models, only the AD case yields a reasonable
model that gives a better fit. (The AB model can be ruled out because it requires the negative parity image D to be magnified
by a factor of $\sim 20$ relative to the macromodel, while the CD model can be ruled out because it requires the positive parity
image A to be demagnified by a factor of $\sim 2$.)

Although we see that relaxing flux constraints lowers the $\chi^2$, we must ask whether that really provides evidence for
substructure, or whether it just indicates that we are using fewer constraints. The test for statistical significance when
removing degrees of freedom is called the F-test (e.g., Bevington & Robinson 1992). The F-test returns a probability that the
change in $\chi^2$ is due simply to the change in the number of degrees of freedom, so a low value of the probability indicates that
the fit really has improved. The test results are given in Table 5, and they confirm our intuition that many of the three-flux
and two-flux models are not significantly better than the ABCD model. However, the ABD and AD models have relatively
low F-test values (0.08 and 0.05, respectively), so we conclude that there is marginal evidence for a radio flux ratio anomaly,
and if real it is probably in image C.



5 CONCLUSIONS 

We have developed a semi-analytic formalism for computing the effects of substructure on the lensed images of a finite-size 
source. By considering the local effects of a clump modeled as an isothermal sphere, we can solve analytically for the perturbed 
micro-image(s), and then compute numerically the change in the position and magnification of the macro-image. While this 
is a simplified toy model, it yields valuable insight into the general features of finite source effects in substructure lensing: 

• The perturbations do not have a simple dependence on source size, but are related to intersections of the source with
micro-caustics.

• Positive parity images are always amplified by isothermal clumps, but negative parity images may be either amplified or 
suppressed depending on the source position and size. 

• Sources that are more than an order of magnitude larger than the clump Einstein radius can still be perturbed at the 
percent level, which mildly contradicts conventional wisdom that a source cannot 'feel' lensing structure on scales smaller 
than itself. 

• Statistical uncertainties in the macromodel do not significantly affect the substructure analysis. Remarkably, the mass 
sheet degeneracy in the macromodel has no effect on the substructure analysis, at least for isothermal clumps. 

• Astrometric perturbations could be at the few milli-arcsec level, and could be identified by comparing the relative image 
positions at different wavelengths. 
The bottom line is that there is a tremendous amount to be learned from high-resolution observations at a variety of
wavelengths that correspond to different source sizes. The promising possibilities are observations of the optical, mid-IR, and radio
continua, and the optical emission lines. (Detailed X-ray observations seem less valuable, because the source will be much
smaller than the caustics for millilensing.) The first steps in this direction have been taken (Agol et al. 2000; Wisotzki et al.
2003; Metcalf et al. 2004), but a more concerted effort to do this for flux ratio anomaly lenses is called for.

It is already possible to place limits on the substructure mass scale: since there is a finite range of magnifications possible 
for a given ratio of the clump Einstein radius b to the source size a, an observed flux perturbation leads directly to a lower 
bound on b/a (even with no knowledge of the relative positions of perturber and source). Adding knowledge of (or assumptions 
about) the source size then leads to a lower bound on the clump mass. These substructure bounds do depend on our assumption 
that each flux ratio anomaly is caused by a single, isolated, isothermal clump; how they change for different clump models 
and for the limit of moderate or high optical depth will be the subject of a follow-up study. Still, the principle that finite 
source effects permit simple lower bounds on the substructure mass scale should be general. 

With this background, we have sought to understand three known lensing systems with strong flux ratio anomalies at
radio wavelengths (B1422+231, B1555+375, and B2045+265), plus one system with marginal evidence for a radio flux ratio
anomaly (B0712+472). We carefully examined macromodels consisting of an isothermal lens galaxy with different types of
angular structure, in order to determine which of the lensed images are perturbed and by how much. Assuming isothermal
clumps, we could then use our substructure analysis to place lower bounds on the clump masses. For B1422+231, we find strong
evidence for a clump in front of image A, and the mass within the clump Einstein radius must be $M > 2 \times 10^3\,(a/10\,\mathrm{pc})^2\,M_\odot$.
For B2045+265 and B1555+375, we find strong evidence for clumps in front of two images in each system. The masses within
the Einstein radii are $M > 10^4$-$10^7\,M_\odot$, which generally agrees with $\Lambda$CDM predictions, although it is important to consider
whether $\Lambda$CDM predicts enough clumps to explain the presence of multiple anomalies in multiple lenses. In B0712+472, there
is marginal evidence for a clump in front of image C. We emphasize that our identification of the images that are perturbed
is independent of assumptions about the nature of substructure; those assumptions enter only when we derive quantitative
clump mass bounds.

To round out our analysis, we have considered several systematic effects in the substructure mass bounds. As noted 
above, statistical uncertainties in the macromodel propagate into the substructure analysis, but their effects are small. In 
many lensing applications the main problem is the mass sheet degeneracy in the macromodel, but we have shown that it has 
no effect on the substructure analysis, at least for isothermal clumps. Thus, it turns out that the main systematics are the 
uncertainty in the source size, and lack of knowledge about whether the clump lies within the main lens galaxy (a standard
assumption) or elsewhere along the line of sight. Varying the clump redshift over a reasonable range can change the clump 
mass bounds by a factor of a few. So it is certainly important for detailed quantitative results, but not so important for order 
of magnitude reasoning. 

Ultimately, using flux ratio anomalies to test $\Lambda$CDM and draw conclusions about the nature of dark matter relies on
sophisticated statistical analyses with realistic clump models (e.g., Dalal & Kochanek 2002; Kochanek & Dalal 2004; Metcalf et al.
2004). Toy models like ours are still valuable, though, because they reveal and clarify the general principles on which the
sophisticated analyses are based. For example, our results suggest that looking for effects requiring comparable scales for 
the source size and the lensing substructure will be the best way to distinguish the substructure explanation for flux ratio 
anomalies from competing hypotheses that may be disfavored but not yet ruled out. Furthermore, we believe that a detailed 
understanding of flux ratio anomalies in individual lenses will always be an important complement to 'black box' statistical 
machinery. 
APPENDIX A: BEHAVIOR AT LARGE SOURCE SIZE



A1 Taylor series expansion of the magnification

We noted in §2.4 that at large $a$ the magnification appeared to be roughly independent of source position. We now confirm
rigorously that it is independent of source position to first order in $1/a$. First, we define some quantities to simplify the
notation:

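$$
\begin{aligned}
c_1 &= 2u_0(1-\kappa-\gamma)\cos\theta + 2b\,(1-\kappa-\gamma\cos 2\theta) + 2v_0(1-\kappa+\gamma)\sin\theta \\
c_2 &= b^2 + u_0^2 + v_0^2 + 2\,b\,u_0\cos\theta + 2\,b\,v_0\sin\theta \\
c_3 &= 4\left[u_0(1-\kappa-\gamma)\cos\theta + b\,(1-\kappa-\gamma\cos 2\theta) + v_0(1-\kappa+\gamma)\sin\theta\right]^2 \\
d_1 &= 4\left[(1-\kappa)^2 + \gamma^2 - 2\gamma(1-\kappa)\cos 2\theta\right]
\end{aligned}
$$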

Note that $c_1$, $c_2$, and $c_3$ all depend on the source position, while $d_1$ does not. With these definitions, we can write

$$
\frac{r_\pm(\theta)}{a} = \frac{2}{d_1}\left[\xi\,c_1 \pm \sqrt{(c_3 - c_2 d_1)\,\xi^2 + d_1}\,\right] \tag{A1}
$$



where $\xi = 1/a$. We immediately see that as $a \to \infty$ ($\xi \to 0$), $r_-/a$ is negative while $r_+/a$ is positive and finite, so the image
boundary is formed only by $r_+$, and the magnification $\mathcal{M} \propto \int [r_+(\theta)/a]^2\,d\theta$ remains finite. Expanding in $\xi$, we find:



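$$
\begin{aligned}
\mathcal{M} &= \frac{(1-\kappa)^2-\gamma^2}{2\pi} \int_0^{2\pi} \left[\frac{4}{d_1} + \xi\,\frac{8\,c_1}{d_1^{3/2}}\right] d\theta + \mathcal{O}(\xi^2) \\
&= 1 + \xi\,\frac{(1-\kappa)^2-\gamma^2}{2\pi} \int_0^{2\pi} \frac{2b\,(1-\kappa-\gamma\cos 2\theta)}{\left[(1-\kappa)^2+\gamma^2-2\gamma(1-\kappa)\cos 2\theta\right]^{3/2}}\,d\theta + \mathcal{O}(\xi^2)
\end{aligned} \tag{A2}
$$
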

where in the last step we used the fact that the terms in $c_1$ proportional to $\cos\theta$ and $\sin\theta$ change sign under $\theta \to \theta + \pi$
while $d_1$ does not, so they integrate to zero over a full period.
The zeroth order term shows that a sufficiently large source is insensitive to the perturber; it has a normalized magnification
$\mathcal{M} \approx 1$, meaning that it only feels the convergence and shear field. In the first order term, the integral cannot be evaluated
analytically, but the important result is that it does not depend on the source position $(u_0, v_0)$. Thus, to first order in $1/a$, the
magnification is independent of source position. Carrying the expansion further reveals that $u_0$ and $v_0$ enter only at second
order in $1/a$.
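
The weak dependence on source position can be checked numerically. The sketch below evaluates $r_+(\theta)/a$ from eq. (A1) and integrates $[r_+/a]^2$ over $\theta$ with a midpoint rule. It assumes a positive-parity macro-image, $(1-\kappa)^2 > \gamma^2$, and a source large enough that the image boundary is formed by $r_+$ alone. It normalizes by $[(1-\kappa)^2-\gamma^2]/2\pi$ so that the zeroth order term is unity. The function names are ours.

```python
import math


def r_plus_over_a(theta, a, b, kappa, gamma, u0, v0):
    """Outer image boundary r_+(theta)/a from eq. (A1)."""
    ct, st, c2t = math.cos(theta), math.sin(theta), math.cos(2.0 * theta)
    c1 = 2.0 * (u0 * (1.0 - kappa - gamma) * ct
                + b * (1.0 - kappa - gamma * c2t)
                + v0 * (1.0 - kappa + gamma) * st)
    c2 = b * b + u0 * u0 + v0 * v0 + 2.0 * b * u0 * ct + 2.0 * b * v0 * st
    c3 = c1 * c1
    d1 = 4.0 * ((1.0 - kappa) ** 2 + gamma ** 2 - 2.0 * gamma * (1.0 - kappa) * c2t)
    xi = 1.0 / a
    return (2.0 / d1) * (xi * c1 + math.sqrt((c3 - c2 * d1) * xi * xi + d1))


def normalized_magnification(a, b, kappa, gamma, u0=0.0, v0=0.0, n=2000):
    """[(1-kappa)^2 - gamma^2]/(2 pi) times the integral of [r_+/a]^2 over theta."""
    dtheta = 2.0 * math.pi / n
    total = sum(
        r_plus_over_a((i + 0.5) * dtheta, a, b, kappa, gamma, u0, v0) ** 2
        for i in range(n)
    )
    return ((1.0 - kappa) ** 2 - gamma ** 2) / (2.0 * math.pi) * total * dtheta


if __name__ == "__main__":
    a, b = 20.0, 1.0  # source size and clump Einstein radius in the same units
    for u0, v0 in ((0.0, 0.0), (1.0, 0.0), (2.0, 1.0)):
        m = normalized_magnification(a, b, 0.0, 0.0, u0, v0)
        print(f"(u0, v0) = ({u0}, {v0}): M = {m:.5f}")
```

With $\kappa = \gamma = 0$ and a centred source the image is a disk of radius $a + b$, so the routine should return $(1 + b/a)^2$. Shifting the source changes the result only at higher order in $1/a$.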



A2 Substructure limits at large source size 

The series expansion of the normalized magnification can be combined with eq. 23 to obtain an analytic result for the
minimum size of the perturbing mass from a single flux measurement. The expansion has the form $\mathcal{M}_{mod} \approx 1 + C\,(b/a)$ where
$C$ is a number that depends on $\kappa$ and $\gamma$. Using this in the definition of the substructure $\chi^2$, we find that the upper and lower
bounds on $b$ can be written as

$$
b_\pm \approx \frac{a}{C}\left(\mathcal{M}_{obs} - 1 \pm \sigma_{obs}\sqrt{\chi^2}\,\right), \tag{A3}
$$

where $\sqrt{\chi^2}$ indicates the confidence level desired (e.g., $\sqrt{\chi^2} = N$ for $N\sigma$). Fig. A1 compares this analysis to the exact $\chi^2$
analysis used in the main paper, and shows that it recovers reasonably accurate lower limits on $b$ even for sources as small as
$a/b \sim 5$.
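
A minimal sketch of eq. (A3) follows. It takes $C$ to be the coefficient of $b/a$ in the first order term of eq. (A2), namely $C = \frac{(1-\kappa)^2-\gamma^2}{2\pi}\int_0^{2\pi} 2\,(1-\kappa-\gamma\cos 2\theta)\,[(1-\kappa)^2+\gamma^2-2\gamma(1-\kappa)\cos 2\theta]^{-3/2}\,d\theta$, and evaluates the integral with a midpoint rule. It assumes a positive-parity macro-image so that $C > 0$. The function names are ours.

```python
import math


def first_order_coefficient(kappa, gamma, n=2000):
    """Coefficient C in M_mod ~ 1 + C (b/a), from the first order term of eq. (A2)."""
    prefactor = ((1.0 - kappa) ** 2 - gamma ** 2) / (2.0 * math.pi)
    dtheta = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        c2t = math.cos(2.0 * (i + 0.5) * dtheta)
        denom = ((1.0 - kappa) ** 2 + gamma ** 2
                 - 2.0 * gamma * (1.0 - kappa) * c2t) ** 1.5
        total += 2.0 * (1.0 - kappa - gamma * c2t) / denom
    return prefactor * total * dtheta


def einstein_radius_bounds(m_obs, sigma_obs, a, kappa, gamma, n_sigma=1.0):
    """Lower and upper bounds (b_minus, b_plus) on b from eq. (A3); assumes C > 0."""
    c = first_order_coefficient(kappa, gamma)
    b_minus = (a / c) * (m_obs - 1.0 - n_sigma * sigma_obs)
    b_plus = (a / c) * (m_obs - 1.0 + n_sigma * sigma_obs)
    return b_minus, b_plus


if __name__ == "__main__":
    # Illustrative numbers only: a 20 per cent flux excess measured to 5 per cent.
    print(einstein_radius_bounds(m_obs=1.2, sigma_obs=0.05, a=10.0, kappa=0.0, gamma=0.0))
```

For $\gamma = 0$ the integral is elementary and gives $C = 2$ for any $\kappa$, consistent with the image of a large centred disk having radius $a + b$ when $\kappa = \gamma = 0$.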



26 Gregory Dobler & Charles R. Keeton 




[Figure A1: two panels showing $\chi^2$ versus perturber size, $b$.]

Figure A1. $\chi^2$ as a function of perturber size for various $a$. The crosses represent the exact analysis (holding the source position
fixed), while the solid lines represent the analysis using the first order Taylor series expansion of the model magnification. The expansion
analysis gives approximately the correct $1\sigma$ lower limit on $b$ for $a > 5$.



